We study the robustness of a fault-tolerant quantum computer subject to Gaussian non-Markovian
quantum noise, and we show that scalable quantum computation is possible if the noise power 
spectrum satisfies an appropriate "threshold condition." Our condition is less sensitive to very- 
high-frequency noise than previously derived threshold conditions for non-Markovian noise. 
I. INTRODUCTION

The theory of fault-tolerant quantum computation shows that properly encoded quantum information can 
be protected against decoherence and processed reliably with imperfect hardware [1]. Demonstrating that
this theory really works in practice is one of the great challenges facing contemporary science. A large-scale 
fault-tolerant quantum computer would be a scientific milestone, and it should also be useful, capable of 
solving hard problems that are beyond the reach of ordinary digital computers. 

Though the theory of quantum fault tolerance strengthens our confidence that truly scalable quantum 
computers can be realized in the next few decades, failure is certainly possible. Perhaps the engineering 
challenges will prove to be so daunting, and the resources needed to overcome them so demanding, that 
society will be unable or unwilling to bear the cost for the foreseeable future. Perhaps new fundamental 
principles of physics, as yet undiscovered, will prevent large-scale quantum computers from behaving as 
currently accepted theory dictates. Finding that quantum computers fail for a fundamental reason would be 
a significant scientific advance, but would disappoint prospective users. 

There is a third reason to worry about the future prospects for fault-tolerant quantum computing. Math- 
ematical results establishing that fault tolerance works effectively are premised on assumptions about the 
properties of the noise. The most obvious requirement is that the noise must be sufficiently weak — if the 
noise strength is below a threshold of accuracy then quantum computing is scalable in principle. But in 
addition, the noise must be suitably local, both spatially and temporally. Perhaps the quest for a quantum 
computer will be frustrated because the noise afflicting actual hardware is just not amenable to fault-tolerant 
protocols. 

We can anticipate therefore that progress toward scalable quantum computing will require an ongoing 
dialog between experimenters who will better understand the limitations of their devices and theorists who 
will propose better ways to overcome the limitations and to evaluate the efficacy of these proposals. In the 
meantime, an important task for theorists is to broaden the range of noise models for which useful accuracy 
threshold theorems can be proven, and we pursue that task in this paper. Our main result is a new proof of 
the threshold theorem for non-Markovian Gaussian noise models, in which system qubits are locally coupled 
to bath variables that have Gaussian fluctuations. Specifically, if the bath is a system of uncoupled harmonic
oscillators, at either zero or nonzero temperature, our theorem expresses the threshold condition in terms of 
the power spectrum of the bath fluctuations. 

Early proofs of the threshold theorem [2, 3, 4] assumed that the noise is Markovian. This means that
each quantum gate in the noisy circuit can be modeled as a unitary transformation that acts jointly on 
a set of the qubits in the computer (the system qubits) and on the environment (the bath variables), but 
where it is assumed that the bath has no memory — the state of the bath is refreshed after every gate. The 
theorem was extended to a class of non-Markovian noise models in [5], and further generalized in [6] and
[7]. The results of [6, 7] have the substantial virtue that the state of the bath and its internal dynamics can
be arbitrary; for fault-tolerant quantum computing to work, it is only required that the bath couple weakly 
and locally to the system. 

However these results also have two serious drawbacks. First, the threshold condition is not easily related 
to experimentally accessible quantities; rather it requires terms in the Hamiltonian that couple the system to 
the bath to have a sufficiently small operator norm. Second, this condition severely constrains the very-high- 
frequency fluctuations of the bath. Intuitively, it seems that this constraint, which may limit the applicability 
of the threshold theorem to noise in some realistic settings, ought not to be necessary, since fluctuations with
a time scale much shorter than the time it takes to execute a quantum gate tend to average out. 

One possible way to reach more pleasing conclusions is to make physically reasonable assumptions about 
the noise that go beyond the assumptions of [6, 7]; that is the approach we follow here. Our new threshold
theorem applies to any noise model in which the bath variables are free fields (aside from their coupling 
to the system qubits), and expresses the threshold condition in terms of the bath's two-point correlation 
function, which is in principle measurable. It should be possible to extend our analysis to the case where 
the bath variables have sufficiently weak self-interactions, though we will not pursue that extension here. 
Furthermore, though our new threshold condition still requires the very-high-frequency bath fluctuations to 
be sufficiently weak, this requirement is considerably relaxed compared to previous threshold theorems that 
apply to non-Markovian noise. Showing that these requirements can be relaxed even further, perhaps by 
making additional physically motivated assumptions, is an important open problem. 

Experimenters use a variety of techniques to suppress the noise in quantum hardware, such as cleverly 
designed pulse sequences to improve the fidelity of quantum gates (spin echos, dynamical decoupling, etc.) 
and intrinsically robust encodings of quantum information (noiseless subsystems, topologically protected 
qubits, etc.). These techniques can be highly effective and are likely to be incorporated into the design of 
future quantum computers, but do not by themselves suffice to ensure the scalability of quantum comput- 
ing. After such tricks are exhausted some residual noise inevitably remains that must be controlled using 
quantum error-correcting codes and fault-tolerant methods. Since our objective in this paper is to study 
the effectiveness of these fault-tolerant methods, our noise models may be viewed as effective descriptions 
of this residual noise in "fundamental" quantum gates that might already be realized using complex and 
sophisticated protocols. 

After reviewing previously known formulations of the quantum accuracy threshold theorem in Sec. II (with
some details relegated to Appendix A), we state our new result in Sec. III, explore some of its implications
in Sec. IV, derive it in Sec. V, and discuss some generalizations in Sec. VI. We derive a sharper result for
the case of pure dephasing noise in Sec. VII. Sec. VIII contains our conclusions.



II. NOISE MODELS AND QUANTUM ACCURACY THRESHOLD THEOREMS 

Here we will briefly review some previously known formulations of the quantum accuracy threshold theorem,
and explain why these results still leave something to be desired. Then in Sec. III we will state our new
result, which addresses some of the shortcomings of the previous results. 

The goal of fault-tolerant quantum computing is to simulate an ideal quantum circuit using the noisy 
gates that can be executed by actual devices. Theoretical results show that this goal is attainable if the 
noise is not too strong and not too strongly correlated. The essential trick that makes fault tolerance work 
is that the logical quantum state processed by the computer can be encoded very nonlocally, so that it is 
well protected from damage caused by local noise. 

It is convenient to analyze the effectiveness of a fault-tolerant noisy circuit by invoking a fault-path 
expansion; schematically, 

$$ \text{Noisy Circuit} = \sum \text{``Fault Path''} . \tag{1} $$

Let us use the term location to speak of an operation in a quantum circuit that is performed in a single time 
step; a location may be a single-qubit or multi-qubit gate, a qubit preparation step, a qubit measurement, 
or the identity operation in the case of a qubit that is idle during the time step. In each fault path, the 
quantum gates are faulty at a specified set of locations in the circuit, while at all other locations the quantum 
gates are assumed to be ideal. We say that the faulty locations are "bad" and that the ideal locations are 
"good." The general concept of a fault-path expansion applies quite broadly, and different noise models can 
be distinguished according to how we flesh out the meaning of eq. (1).
A. Local stochastic noise

In a "stochastic" noise model we assign a probability to each fault path. We speak of local stochastic
noise with strength $\varepsilon$ if, for any specified set $\mathcal{I}_r$ of $r$ locations in the circuit, the sum $P^{\rm bad}(\mathcal{I}_r)$ of the
probabilities of all fault paths that are bad at all of these $r$ locations satisfies

$$ P^{\rm bad}(\mathcal{I}_r) \le \varepsilon^r . \tag{2} $$

In this noise model, no further restrictions are imposed on the noise, and in particular the trace-preserving 
quantum operation applied at the faulty locations may be chosen for each fault path by an adversary who 
wants the computation to fail. Thus the faults can be correlated, both spatially and temporally, but the 
adversary's power is limited because an attack on r specified circuit locations occurs with probability at 
most $\varepsilon^r$. The noise is "local" in the sense that attacking each additional location suppresses the probability
of the fault path by another power of $\varepsilon$.

Most proofs of the threshold theorem use recursive simulations. This means that quantum information 
is protected by a hierarchy of codes within codes, and that the fault-tolerant circuit has a self-similar 
structure. We refer to an unencoded quantum circuit as a "level-0" simulation. In a level-1 simulation, each 
elementary gate in the level-0 circuit is replaced by a level-1 gadget constructed from elementary gates; this 
1-gadget performs the appropriate encoded operation on logical qubits that are protected by a quantum 
error-correcting code C. In a level-2 simulation, each elementary gate in the ideal circuit is replaced by a 
level-2 gadget; the 2-gadget is constructed by replacing each elementary gate in the 1-gadget by a 1-gadget. 
A 2-gadget operates on quantum information protected by $C \triangleright C$, where $\triangleright$ denotes code concatenation. (That
is, $C_1 \triangleright C_2$ is encoded by first encoding the "outer" code $C_2$, and then encoding each qubit in the $C_2$ block
using the "inner" code $C_1$.) In a level-$k$ simulation, each elementary gate in the ideal circuit is replaced by a
level-$k$ gadget, constructed by replacing each elementary gate in the $(k-1)$-gadget by a 1-gadget; it operates
on quantum information protected by $C^{\triangleright k}$.

For local stochastic noise, and also for other noise models with suitable properties, a recursive simulation 
can be analyzed by a procedure called level reduction, in which a level-$k$ simulation is mapped to a
"coarse-grained" level-$(k-1)$ simulation that acts on the top-level logical information in exactly the same way.
Suppose for example that C is a distance-3 code that can correct one error. Then if the 1-gadgets are properly 
designed, each "good" 1-gadget that contains no more than one faulty location simulates the corresponding 
ideal gate correctly, while "bad" 1-gadgets with more than one fault may simulate the ideal gate incorrectly. 
In the level reduction step, for each fault path the good 1-gadgets are mapped to ideal level-0 gates, while the 
bad 1-gadgets are mapped to faulty level-0 gates. After this step, the resulting noisy circuit is still subject 
to local stochastic noise, but with a renormalized value of the noise strength 

$$ \varepsilon^{(1)} = \varepsilon^2/\varepsilon_0 = \varepsilon_0 (\varepsilon/\varepsilon_0)^2 . \tag{3} $$

The renormalized value of the noise strength is $O(\varepsilon^2)$, because at least two faults are required for a 1-gadget
to fail; the quantity $\varepsilon_0^{-1}$ is a combinatoric factor counting the number of "malignant" sets of locations within
the 1-gadget where faults can cause failure. 

Since level reduction maps local stochastic noise to local stochastic noise (but with a revised value of the 
noise strength), the level reduction step can be carried out repeatedly, and analyzed by the same method 
each time. That the structure of the noise is preserved, even though its strength is renormalized, is a useful 
feature of the local stochastic noise model not shared by some noise models. For example, if faults in level-0 
gates were independently and identically distributed, the effective noise model after one level reduction step 
would become correlated rather than independent. See [5J |5] for a more detailed discussion of the level 
reduction procedure. 

By repeating the level reduction step a total of $k$ times, we reduce the level-$k$ simulation to an effective
level-0 (i.e., unencoded) simulation with noise strength

$$ \varepsilon^{(k)} = \varepsilon_0 (\varepsilon/\varepsilon_0)^{2^k} . \tag{4} $$

It follows that for $\varepsilon < \varepsilon_0$ (the accuracy threshold), the effective noise strength becomes negligibly small for $k$
sufficiently large, and the simulation becomes highly reliable. More precisely, for any fixed $\varepsilon < \varepsilon_0$ and fixed
$\delta > 0$, an ideal circuit with $L$ gates can be simulated with error probability $\delta$ by a noisy circuit with $L^*$
gates, where for some constant $c$

$$ L^* = O\left( L \left( \frac{\log(\varepsilon_0 L/\delta)}{\log(\varepsilon_0/\varepsilon)} \right)^{c} \right) . \tag{5} $$

(The constant c is determined by the size of the 1-gadgets.) Thus, with reasonable overhead cost, the noisy 
simulation gets the right answer with high probability. This is the quantum accuracy threshold theorem for 
local stochastic noise. 
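
As a rough numerical illustration of eq. (4), the following Python sketch iterates the renormalized noise strength and finds how many levels of concatenation drive the total failure probability below a target. The stopping rule (number of gates times the effective noise strength), the function names, and the sample numbers are assumptions made here for illustration; they are not taken from the theorem's proof.

```python
import math


def effective_noise(eps: float, eps0: float, k: int) -> float:
    """Noise strength after k level reductions, as in eq. (4)."""
    return eps0 * (eps / eps0) ** (2 ** k)


def levels_needed(eps: float, eps0: float, num_gates: int, delta: float) -> int:
    """Smallest k with num_gates * eps^(k) <= delta.

    Assumption: the failure probability of the level-0 equivalent circuit
    is bounded by (number of gates) x (effective noise strength).
    """
    if not 0.0 < eps < eps0:
        raise ValueError("need 0 < eps < eps0, i.e. noise below threshold")
    k = 0
    while num_gates * effective_noise(eps, eps0, k) > delta:
        k += 1
    return k


def gate_overhead(gadget_size: int, k: int) -> int:
    """Elementary gates per ideal gate if each 1-gadget has gadget_size gates."""
    return gadget_size ** k


if __name__ == "__main__":
    # Hypothetical numbers: threshold 1e-3, physical noise 1e-4,
    # one million ideal gates, target failure probability 1%.
    k = levels_needed(1e-4, 1e-3, 10**6, 1e-2)
    print("levels of concatenation:", k)
    print("overhead per gate for 100-gate gadgets:", gate_overhead(100, k))
```

Because the exponent in eq. (4) doubles with each level, the required number of levels grows only doubly-logarithmically with the circuit size, which is why the overhead in eq. (5) is polylogarithmic.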

For the threshold theorem to apply, two features of the simulation are essential: First, we must assume 
that quantum gates can be executed in parallel — otherwise we would be unable to control storage errors that 
occur simultaneously in different parts of the computer. Second, we assume that qubits can be "discarded" 
and replaced by fresh qubits (for example, by measuring the qubits and resetting them) — otherwise we 
would be unable to flush from the computer the entropy introduced by noise. Estimates of the accuracy 
threshold $\varepsilon_0$ often rely on further assumptions. For example, if we assume that qubit measurements are as
fast as quantum gates, that classical computations are arbitrarily accurate, that the accuracy of a two-qubit 
quantum gate does not depend on the spatial separation of the qubits, and that no data qubits "leak" from 
the computational Hilbert space, then it has been shown that $\varepsilon_0 > 0.67 \times 10^{-3}$ [9]. For noise models with
weaker correlations than in the local stochastic noise model, the proven accuracy threshold is above $10^{-3}$
[10, 11], and numerical evidence suggests that the actual value of the threshold can be of order 1% [12, 13].
Furthermore, it has been shown that the threshold is not drastically reduced if some of these assumptions are 
relaxed, for example by allowing measurements to be slow [14], allowing leakage [15], or requiring quantum
gates to be local on a two-dimensional array [16].



B. Local non-Markovian noise 



The local stochastic noise model is handy for analysis and has some quasi-realistic features, but it is still 
rather artificial. From a physics perspective, it is more natural to formulate the noise model in terms of a 
Hamiltonian H that governs the joint evolution of the system and the bath. We may express H as 

$$ H = H_S + H_B + H_{SB} , \tag{6} $$

where $H_S$ is the time-dependent system Hamiltonian that realizes the ideal quantum circuit, $H_B$ is the
(arbitrary) Hamiltonian of the bath, and $H_{SB}$ is a perturbation, responsible for the noise, that may couple
the system to the bath. We say that such a noise model is non-Markovian, meaning that quantum information
can escape from the system to the bath and then return to the system at a later time, so that the state of
the system at time $t + dt$ is not uniquely determined by its state at time $t$. Furthermore, $H_{SB}$ may also
contain terms that act nontrivially only on the system, representing unitary noise arising from imperfect
control of the system Hamiltonian. Actually, the local stochastic noise model already incorporates some
non-Markovian effects; even when fault paths are weighted by probabilities, the adversary who attacks the
circuit might employ a quantum memory. But different methods are needed to analyze the consequences of
Hamiltonian noise models, because fault paths are summed coherently rather than stochastically.

The locations in a quantum circuit include not only quantum gates and storage steps, but also qubit 
preparation and measurement steps. Preparation and measurement noise can be incorporated into a Hamil- 
tonian description by various means. In this paper we will take an especially simple approach, modeling 
an imperfect preparation by an ideal preparation followed by evolution governed by H, and modeling an 
imperfect measurement by an ideal measurement preceded by evolution governed by H. For the time being, 
to simplify the discussion, we will imagine that system qubits are prepared only at the very beginning of the 
computation and measured only at the very end. Preparations and measurements that occur at intermediate
times can easily be incorporated; we will elaborate on this point in Sec. VI. For the continuous-time
Hamiltonian dynamics we are now considering, a "location" consists of a specified qubit or set of qubits to
which a gate is applied, and a specified time interval during which that gate is realized by the ideal system
Hamiltonian $H_S$.
We may say that the Hamiltonian noise model is "local" if the perturbation $H_{SB}$ can be expressed as a
sum of terms

$$ H_{SB} = \sum_a H_{SB}^{(a)} , \tag{7} $$

where each $H_{SB}^{(a)}$ acts on only a small number of system qubits (while perhaps also acting collectively on
many bath variables). The joint unitary time evolution operator $U_{SB}$ for system and bath, resulting from
integrating the Schrödinger equation for Hamiltonian $H$, can be formally expanded to all orders in
time-dependent perturbation theory in $H_{SB}$. In any fixed term in this expansion, perturbations chosen from
the set $\{H_{SB}^{(a)}\}$ are inserted at specified times. For such a fixed term in the perturbation expansion, let us
say that a location in the (level-0) noisy simulation is "bad" if an inserted perturbation acts nontrivially
somewhere inside that location; otherwise that location is "good." Of course, under this definition a single
insertion of $H_{SB}^{(a)}$ might cause two (or perhaps more) locations to be bad in a particular time step, if $H_{SB}^{(a)}$
acts collectively on two qubits that are undergoing different gates executed in parallel in the ideal circuit.

As already noted, we may assume that the system qubits have been initialized ideally at the start of the 
Hamiltonian evolution; we denote this initial system state by $|\Psi_S^0\rangle$. We also assume that the initial state of
the bath is a pure state $|\Psi_B^0\rangle$. There is really no loss of generality in supposing that the bath starts out in
a pure state; if we wish to consider a mixed initial state of the bath instead (for example, a thermal state), 
we may include in the bath a "reference" system that "purifies" the mixed state. 

We have also noted that we may assume that final measurements performed on system qubits are ideal. 
Just before these final measurements are conducted, the (pure) state of the system and bath is 

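$$ |\Psi_{SB}\rangle = U_{SB} |\Psi_{SB}^0\rangle , \tag{8} $$

where $|\Psi_{SB}^0\rangle = |\Psi_S^0\rangle \otimes |\Psi_B^0\rangle$ is the initial state of system and bath. For any specified set $\mathcal{I}_r$ of $r$ locations
in the circuit, let us denote by $|\Psi_{SB}^{\rm bad}(\mathcal{I}_r)\rangle$ the sum of all the terms in the formal perturbation expansion
of $|\Psi_{SB}\rangle$ such that all of these $r$ locations are bad. Then we speak of local non-Markovian noise (or more
briefly local noise) with strength $\varepsilon$ if

$$ \| \, |\Psi_{SB}^{\rm bad}(\mathcal{I}_r)\rangle \, \| \le \varepsilon^r . \tag{9} $$

The noise strength $\varepsilon$ can be related to properties of the perturbation $H_{SB}$. We will sometimes refer to this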
model as local coherent noise, to emphasize that (in contrast to the local stochastic noise model) fault paths 
are assigned amplitudes rather than probabilities. 

Although there are some new subtleties (see [6] and Appendix A), the level-reduction concept can be
applied to Hamiltonian noise models in much the same way as for stochastic models. We may say that a
1-gadget is bad if it contains bad level-0 gates at a malignant set of locations, that a 2-gadget is bad if it
contains bad 1-gadgets at a malignant set of locations, that a 3-gadget is bad if it contains bad 2-gadgets at
a malignant set of locations, and so on. For any specified set $\mathcal{I}_r^{(k)}$ of $r$ $k$-gadgets in the circuit, let us denote
by $|\Psi_{SB}^{\rm bad}(\mathcal{I}_r^{(k)})\rangle$ the sum of all the terms in the formal perturbation expansion of $|\Psi_{SB}\rangle$ such that all of these
$r$ $k$-gadgets are bad. Then it follows from eq. (9) that

$$ \| \, |\Psi_{SB}^{\rm bad}(\mathcal{I}_r^{(k)})\rangle \, \| \le \left( \varepsilon^{(k)} \right)^r , \tag{10} $$

with $\varepsilon^{(k)}$ as in eq. (4); the derivation of eq. (10) is sketched in Appendix A. Furthermore, a level-$k$ simulation
in which all $k$-gadgets are good simulates the ideal circuit perfectly. In this sense, repeated level reduction
reduces a level-$k$ simulation to an equivalent level-0 simulation while mapping local noise to local noise with
a renormalized noise strength $\varepsilon^{(k)}$, and for $\varepsilon < \varepsilon_0$, the renormalized noise strength becomes negligible for
large $k$. The threshold value $\varepsilon_0$ of the noise strength for local noise is of the same order (though not exactly
the same) as the threshold for local stochastic noise. We emphasize that, once eq. (9) is established, we can
derive eq. (10) without any further assumptions about the Hamiltonian $H = H_S + H_B + H_{SB}$.

The strength $\varepsilon$ of local noise can be estimated based on the detailed properties of the expansion in eq. (7)
of the perturbation $H_{SB}$ in terms of local system operators. For example, in [6] the noise was assumed to
be "short range" in the sense that the perturbation $H_{SB}$ acts collectively on a pair of data qubits only while
the ideal system Hamiltonian $H_S$ also couples those two data qubits; that is, only while the ideal quantum
circuit calls for that pair of qubits to undergo a two-qubit gate. For this short-range local noise model, it
was shown that eq. (9) is satisfied if we choose

$$ \varepsilon = \left( \max \| H_{SB}^{(a)}(t) \| \right) \cdot t_0 , \tag{11} $$



where $t_0$ is the time needed to execute a quantum gate, $\| \cdot \|$ denotes the sup operator norm, and the maximum
is over all circuit locations and all times. On the other hand, in [7] the noise was assumed to be "long range"
with $H_{SB}$ coupling each pair of data qubits irrespective of the structure of the ideal circuit. In that case we
may write

$$ H_{SB} = \sum_{\langle ij \rangle} H_{\langle ij \rangle} , \tag{12} $$

where the sum is over all unordered pairs of system qubits; $H_{\langle ij \rangle}$ acts collectively on the pair of qubits
$\langle ij \rangle$ and also on the bath. For this long-range local noise model, it was shown that eq. (9) is satisfied if
we choose

$$ \varepsilon^2 = C \cdot \left( \max_i \sum_j \| H_{\langle ij \rangle} \| \right) \cdot t_0 , \tag{13} $$

where $C$ is the numerical constant $C = 2e \approx (2.34)^2$. (This is actually a slight improvement over the value



of C reported in [7]; the improved value can be derived using the reasoning described in Sec. VC below, if 
we assume < e.) 



The origin of eq. (11) is easy to understand intuitively [5]. If each one of the $r$ specified locations in $\mathcal{I}_r$
is bad, then the perturbation must be inserted at least once in each of these locations, and each insertion
reduces the norm of the state by a factor of at least $\| H_{SB}^{(a)} \|$. Inside each location, there is an earliest insertion
of the perturbation that can occur at any time during the duration of the location, a time window of width
$t_0$. Integrating over the time of the earliest insertion of the perturbation inside each location, we obtain
eq. (11). For the long-range noise model, a single insertion of the perturbation $H_{\langle ij \rangle}$ can cause two circuit
locations to be bad if qubits $i$ and $j$ are participating in separate gates, and therefore the noise strength is
correspondingly higher (observe that $\varepsilon^2$ rather than $\varepsilon$ appears on the left-hand side of eq. (13)).



C. Assessment 

The results eq. (11) and eq. (13) are significant, because they demonstrate that quantum computing is
scalable in principle for non-Markovian noise described by a system-bath Hamiltonian. Furthermore, this
formulation of the threshold theorem has the noteworthy advantage that the argument works for any bath
Hamiltonian $H_B$. The dynamics of the bath does not matter, as long as the perturbation $H_{SB}$ is "local"
and sufficiently weak.



However, expressing the threshold condition as in eq. (11) or eq. (13) has serious drawbacks. First
we should note that while in the local stochastic noise model we may interpret the noise strength $\varepsilon$ as
an error probability per gate, in the non-Markovian noise model $\varepsilon$ is really an error amplitude. Since a
probability is a square of an amplitude, requiring $\varepsilon < \varepsilon_0$ in the local noise model is a far more stringent
criterion than requiring $\varepsilon < \varepsilon_0$ in the local stochastic noise model. Our analysis yields a much weaker lower
bound on the accuracy threshold for the local noise model than for the local stochastic noise model because
we pessimistically allow the bad fault paths to add together with a common phase and thus to interfere
constructively. Most likely this analysis is far too pessimistic; it is reasonable to expect that distinct fault
paths have only weakly correlated phases, and if so, then the modulus of a sum of $N$ fault paths should
grow like $\sqrt{N}$ rather than linearly in $N$. That is, if the phases of fault paths can be regarded as random,
then we expect the probabilities of the fault paths, rather than their amplitudes, to accumulate linearly.
An important open problem for the theory of quantum fault tolerance is to put this phase-randomization
hypothesis on a rigorous footing, and thereby to establish a much higher estimate of the accuracy threshold
for local noise. But we will not be addressing this problem in this paper.
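
The contrast between coherent and random-phase accumulation can be seen in a small numerical experiment. The Python sketch below is only an illustration of the heuristic, under the assumption (made here, not established in this paper) that each fault path contributes a unit-modulus amplitude with an independent, uniformly random phase; the function name and sample sizes are hypothetical.

```python
import cmath
import math
import random


def mean_modulus_random_phases(n_paths: int, trials: int, rng: random.Random) -> float:
    """Average |sum of n_paths unit amplitudes| when phases are i.i.d. uniform."""
    total = 0.0
    for _ in range(trials):
        amplitude = sum(
            cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi)) for _ in range(n_paths)
        )
        total += abs(amplitude)
    return total / trials


rng = random.Random(0)          # fixed seed so the run is reproducible
n_paths = 400
coherent = float(n_paths)       # common phase: moduli add linearly in N
incoherent = mean_modulus_random_phases(n_paths, trials=200, rng=rng)

if __name__ == "__main__":
    print("common phase :", coherent)
    print("random phases:", incoherent, "compare sqrt(N) =", math.sqrt(n_paths))
```

With a common phase the modulus equals $N$, while with random phases it stays of order $\sqrt{N}$, so the squared modulus (a probability) grows linearly in $N$ in the random-phase case.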



There are other drawbacks of the threshold condition eq. (11) that we will try to address, however. One
issue is that the norm of the system-bath Hamiltonian is not directly measurable in experiments, and it
would be far preferable to state the threshold condition in terms of experimentally accessible quantities,
such as the noise power spectrum. In fact, for otherwise reasonable noise models, the norm could be
formally infinite (if for example the system qubits couple to unbounded bath operators such as the quadrature
amplitudes of bath oscillators), and in such cases the threshold theorem has little force.



In more physical terms, an undesirable feature of demanding small $\varepsilon$ where $\varepsilon$ is given by eq. (11) is that this
condition requires that the very-high-frequency component of the noise be particularly weak, a requirement
that seems not to be physically well motivated. To be concrete, suppose that

$$ H_{SB}^{(a)} = S^{(a)} \otimes B^{(a)} , \tag{14} $$

where $S^{(a)}$ is a local Hermitian system operator with $\| S^{(a)} \| = 1$ and $B^{(a)}$ is a Hermitian bath operator.



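Then combining the condition $\varepsilon < \varepsilon_0$ with eq. (11) implies in particular that

$$ t_0^2 \, \langle B^{(a)}(t)^2 \rangle = t_0^2 \int_{-\infty}^{\infty} \frac{d\omega}{2\pi} \, \tilde\Delta(\omega) < \varepsilon_0^2 , \tag{15} $$

where $\tilde\Delta(\omega)$ is the Fourier transform of the bath's two-point correlation function, defined by

$$ \langle B^{(a)}(t) B^{(a)}(t') \rangle = \int_{-\infty}^{\infty} \frac{d\omega}{2\pi} \, e^{-i\omega(t-t')} \tilde\Delta(\omega) \tag{16} $$

($B^{(a)}(t)$ denotes the interaction-picture bath operator). Suppose that the fluctuations of the bath variables
are Ohmic (and at zero temperature); that is, linear in frequency at low (positive) frequency and exponentially
decaying at frequencies large compared to the cutoff frequency $\tau^{-1}$:

$$ \tilde\Delta(\omega) = \begin{cases} 2\pi A \omega e^{-\omega\tau} & \text{if } \omega > 0 , \\ 0 & \text{if } \omega < 0 , \end{cases} \tag{17} $$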



where $A$ is a positive dimensionless parameter quantifying the strength of the Ohmic noise. Then the
threshold condition implies that

$$ A^{1/2} \left( \frac{t_0}{\tau} \right) < \varepsilon_0 . \tag{18} $$

For the case of Ohmic noise, then, the quantity that is required to be small is linearly "ultraviolet divergent";
that is, it has a linear sensitivity to the high-frequency cutoff $\tau^{-1}$, which may be orders of magnitude higher
than the characteristic frequency $t_0^{-1}$ of the ideal computation.
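
The linear cutoff dependence in eq. (18) can be checked numerically. The Python sketch below assumes that the relevant quantity is the equal-time bath variance obtained by integrating the Ohmic spectrum of eq. (17) over frequency with measure $d\omega/2\pi$, which gives $A/\tau^2$ in closed form; the function names, the midpoint-rule integrator, and the sample parameters are illustrative choices, not part of the original analysis.

```python
import math


def ohmic_variance(A: float, tau: float, n_steps: int = 200_000,
                   cutoff_multiple: float = 60.0) -> float:
    """Midpoint-rule estimate of the integral of A * w * exp(-w * tau) over w >= 0.

    This is the Ohmic spectrum 2*pi*A*w*exp(-w*tau) integrated with measure
    dw / (2*pi). The integral is truncated at w = cutoff_multiple / tau, where
    the integrand is negligible.
    """
    w_max = cutoff_multiple / tau
    dw = w_max / n_steps
    total = 0.0
    for i in range(n_steps):
        w = (i + 0.5) * dw
        total += A * w * math.exp(-w * tau)
    return total * dw


def norm_based_strength(A: float, tau: float, t0: float) -> float:
    """Left-hand side of eq. (18): sqrt(A) * t0 / tau."""
    return math.sqrt(A) * t0 / tau


if __name__ == "__main__":
    A, tau, t0 = 1e-6, 1e-3, 1.0   # hypothetical: cutoff 1000x the gate frequency
    print("numerical variance:", ohmic_variance(A, tau))
    print("closed form A/tau^2:", A / tau**2)
    print("sqrt(A) * t0 / tau :", norm_based_strength(A, tau, t0))
```

Halving $\tau$ (doubling the cutoff frequency) doubles the quantity that must stay below $\varepsilon_0$, even though the added noise lies at frequencies far above $t_0^{-1}$.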

The extreme sensitivity of the threshold condition to the very-high-frequency noise seems surprising, 
since one's naive expectation is that noise with zero mean and frequency much larger than $t_0^{-1}$ should nearly
average out. This unsatisfying limitation of eq. (11), already pointed out in the original paper by Terhal and
Burkard [5] (and later highlighted by Alicki [17] and by Hines and Stamp [18]), may just be a shortcoming of
the analysis, but conceivably it hints at a deeper problem for quantum fault tolerance. For example, it has
been suggested [19] that during the course of a long quantum computation, an initially benign state of the
bath may be pushed toward a far more malicious state that compromises the fault-tolerant protocol. Perhaps 
high-frequency noise with zero mean, which locally seems incapable of inflicting serious harm, has cumulative 
global effects that are surprisingly troublesome. Whether or not one suspects that the environment could be 
so cunning an adversary, stronger rigorous arguments establishing that quantum computing is robust against 
non-Markovian noise would surely be welcome!

Our central result in this paper is a new estimate of the noise strength $\varepsilon$ that applies to a Hamiltonian
description of Gaussian non-Markovian noise. We will formulate the noise model and state our result in
Sec. III, discuss some implications in Sec. IV, and postpone the derivation until Sec. V. For this particular
important class of noise models, we will be able to state a threshold condition that is less sensitive to
very-high-frequency noise, though some sensitivity will still remain. The combinatoric analysis that leads to our
result borrows substantially from the derivation in [7] of eq. (13), though the context is rather different.
III. GAUSSIAN NOISE AND THE THRESHOLD CONDITION



By "Gaussian noise" we mean a Hamiltonian noise model where the bath is a set of uncoupled harmonic
oscillators, and each system qubit couples to a linear combination of oscillator quadrature amplitudes; hence
(in units with $\hbar = 1$)

$$ H_B = \sum_k \omega_k a_k^\dagger a_k , \tag{19} $$

and

$$ H_{SB} = \sum_{x,\alpha} \sigma_\alpha(x) \otimes \tilde\phi_\alpha(x,t) , \tag{20} $$

where $\tilde\phi_\alpha(x,t)$ is the Hermitian operator

$$ \tilde\phi_\alpha(x,t) = \sum_k \left( g_{k,\alpha}(x,t) \, a_k + g_{k,\alpha}^*(x,t) \, a_k^\dagger \right) . \tag{21} $$

Here $x$ is a label indicating a system qubit's position, and $\{\sigma_\alpha(x), \alpha = 1,2,3\}$ are the three Pauli operators
acting on qubit $x$. The $a_k$'s are annihilation operators for the bath oscillators, satisfying the commutation
relation $[a_k, a_{k'}^\dagger] = \delta_{kk'}$, and $g_{k,\alpha}(x,t)$ is a complex coupling parameter that determines how strongly
oscillator $k$ couples to qubit $x$ at time $t$.



As explained in Sec. II B (see also Sec. VI A), we may assume without loss of generality that the bath is
prepared in a pure state $|\Psi_B^0\rangle$ at the beginning of the computation. The Hamiltonian $H_B + H_{SB}$, along with
the choice of the bath's initial state $|\Psi_B^0\rangle$, defines our noise model. The bath fluctuations will be Gaussian if
the state $|\Psi_B^0\rangle$ is a Gaussian state (that is, a generalized "squeezed" state) of the oscillator bath; a purified
thermal state is a special case of such a squeezed state.

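It is useful to define the "interaction picture" bath operator $\phi_\alpha(x,t)$ as

$$ \phi_\alpha(x,t) = e^{i H_B t}\, \tilde\phi_\alpha(x,t)\, e^{-i H_B t} = \sum_k \left( g_{k,\alpha}(x,t)\, a_k e^{-i\omega_k t} + g_{k,\alpha}^*(x,t)\, a_k^\dagger e^{i\omega_k t} \right) , \tag{22} $$

and to define the bath's two-point correlation function as

$$ \Delta(\alpha_1,x_1,t_1;\alpha_2,x_2,t_2) \equiv \langle \Psi_B^0 | \phi_{\alpha_1}(x_1,t_1)\, \phi_{\alpha_2}(x_2,t_2) | \Psi_B^0 \rangle . \tag{23} $$

We will sometimes use the abbreviated notation $\phi(1)$ for $\phi_{\alpha_1}(x_1,t_1)$ and $\Delta(1,2)$ for
$\Delta(\alpha_1,x_1,t_1;\alpha_2,x_2,t_2)$; we also define

$$ |\Delta(1,2)| \equiv \sum_{\alpha_1,\alpha_2} |\Delta(\alpha_1,x_1,t_1;\alpha_2,x_2,t_2)| . \tag{24} $$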

When we say that the noise is Gaussian, we mean that the bath variable $\phi_\alpha(x,t)$ obeys Gaussian statistics:
all $n$-point bath correlation functions vanish for $n$ odd, and the $2n$-point function can be expressed in terms
of two-point functions. Using $\langle \cdot \rangle$ to denote the expectation value in the state $|\Psi_B^0\rangle$, Gaussian
statistics implies that

$$ \langle \phi(1)\phi(2)\phi(3)\cdots\phi(2n) \rangle = \sum_{\rm contractions} \Delta(i_1,i_2)\,\Delta(i_3,i_4)\cdots\Delta(i_{2n-1},i_{2n}) , \tag{25} $$

where summing over "contractions" means summing over the $(2n)!/(2^n n!)$ ways to divide the labels
$1,2,3,\ldots,2n$ into $n$ unordered pairs. For example, if $\phi$ is a Gaussian variable, then the four-point function is

$$ \langle \phi(1)\phi(2)\phi(3)\phi(4) \rangle = \Delta(1,2)\Delta(3,4) + \Delta(1,3)\Delta(2,4) + \Delta(1,4)\Delta(2,3) , \tag{26} $$

as illustrated in Fig. 1. This expansion of the $2n$-point function in terms of two-point functions is sometimes
called "Wick's theorem."

FIG. 1: The four-point correlation function for a free field can be expressed in terms of products of two-point
correlation functions by summing over all "contractions," where each contraction divides the four points into two
unordered pairs.
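
The counting of contractions in Wick's theorem can be checked by brute force. The following is a minimal sketch (the function names `pairings` and `num_pairings` are illustrative choices, not notation from the paper) that enumerates every division of a list of labels into unordered pairs and compares the count with $(2n)!/(2^n n!)$.

```python
from math import factorial


def pairings(labels):
    """Yield every way to split `labels` into unordered pairs (Wick contractions)."""
    if not labels:
        yield []
        return
    first, rest = labels[0], labels[1:]
    for i, partner in enumerate(rest):
        # Pair the first label with each possible partner, then recurse on the rest.
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def num_pairings(n):
    """Closed-form count (2n)! / (2^n n!) of pairings of 2n labels."""
    return factorial(2 * n) // (2 ** n * factorial(n))


if __name__ == "__main__":
    # The three contractions of the four-point function, in the order of eq. (26).
    for contraction in pairings([1, 2, 3, 4]):
        print(contraction)
```

For four labels this reproduces the three terms of eq. (26); for six labels the enumeration yields the 15 pairings predicted by the closed form.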



Now we can state our main result: Gaussian noise obeys the local noise condition eq. (9), with noise
strength

$$ \epsilon^2 = C \cdot \max_{\rm Loc} \int_{1,{\rm Loc}} \int_{2,{\rm All}} |\Delta(1,2)| , \tag{27} $$

where $C = 2e \approx (2.34)^2$ is the numerical constant defined earlier (and where we have assumed < e).
Here $\int_{1,{\rm Loc}}$ indicates that one leg $(x_1,t_1)$ of the two-point function is integrated over a single location in
the circuit: $x_1$ is summed over the qubits participating in a particular gate, and $t_1$ is integrated over the
time interval in which that gate is executed. And $\int_{2,{\rm All}}$ indicates that the other leg $(x_2,t_2)$ of the two-point
function is summed over all system qubits and integrated over the entire duration of the computation. The
maximum is with respect to all possible circuit locations for $(x_1,t_1)$. The threshold condition $\epsilon < \epsilon_0$, with
$\epsilon$ given by eq. (27), now becomes a condition on the two-point correlation function of the bath. We note
that the ordering of the operators $\phi(1)$ and $\phi(2)$ does not matter in eq. (27) because $|\Delta(1,2)| = |\Delta(2,1)|$;
changing the ordering modifies only the phase of $\Delta(1,2)$, not its modulus.

Another noteworthy feature is that our estimate of $\epsilon$ applies for an arbitrary system Hamiltonian. This
property may seem unexpected at first, as we know that in some settings the damage caused by the noise
can depend on the relation between the energy spectrum of the system Hamiltonian $H_S$ and the power spectrum
of the noise. For example, the spontaneous decay rate for a qubit with energy splitting $\hbar\omega$ depends on the
noise power at circular frequency $\omega$. How, then, can our threshold condition depend only on the noise spectrum
and not on the energy spectrum of $H_S$? The answer is that by taking the modulus $|\Delta(1,2)|$ of the bath two-point
function in eq. (27) we are already being maximally pessimistic about how the spectrum of $H_S$ matches
the noise power spectrum. Thus there are both advantages and disadvantages in formulating a threshold
condition that is general enough to apply for any ideal system Hamiltonian. On the one hand we find a
criterion for scalable quantum computing that can be stated easily and rigorously proved by a reasonably
simple argument. On the other hand, the price of such rigor is that our stated criterion may be far more
demanding than it really needs to be.



The crucial assumption in the derivation of eq. (27) is eq. (20), where $\tilde\phi_\alpha(x,t)$ is a "free field," i.e., obeys
Gaussian statistics; thus eq. (21) could be regarded as merely a general phenomenological representation of a
Gaussian field, and not necessarily as a fundamentally accurate microscopic description of the bath. Caldeira
and Leggett have argued that noise is expected to be Gaussian, at least to an excellent approximation,
in a wide variety of realistic physical settings where the system is weakly coupled to many environmental
degrees of freedom.

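If the initial state of the bath is a thermal state with inverse temperature $\beta = 1/kT$, then the mean
occupation number of each oscillator is determined by the Bose-Einstein distribution function; we have

$$ \langle \Psi_B^0 | a_k^\dagger a_k | \Psi_B^0 \rangle = \frac{1}{e^{\beta\omega_k} - 1} = \langle \Psi_B^0 | a_k a_k^\dagger | \Psi_B^0 \rangle - 1 , $$

$$ \langle \Psi_B^0 | a_k a_k | \Psi_B^0 \rangle = 0 = \langle \Psi_B^0 | a_k^\dagger a_k^\dagger | \Psi_B^0 \rangle , \tag{28} $$

and therefore

$$ \Delta(\alpha_1,x_1,t_1;\alpha_2,x_2,t_2) = \frac{1}{2} \sum_k g_{k,\alpha_1}(x_1,t_1)\, g_{k,\alpha_2}^*(x_2,t_2)\, e^{-i\omega_k(t_1-t_2)} \left( \coth(\beta\omega_k/2) + 1 \right) $$

$$ \qquad + \frac{1}{2} \sum_k g_{k,\alpha_1}^*(x_1,t_1)\, g_{k,\alpha_2}(x_2,t_2)\, e^{i\omega_k(t_1-t_2)} \left( \coth(\beta\omega_k/2) - 1 \right) . \tag{29} $$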
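
The thermal correlation function of eq. (29) is easy to evaluate for a finite set of oscillators. The sketch below assumes time-independent couplings supplied as plain Python lists (the names `bose_einstein` and `thermal_two_point`, and the list-based interface, are illustrative assumptions). It can be used to confirm numerically that $\coth(\beta\omega/2) = 2\bar n + 1$ for the Bose-Einstein occupation $\bar n$, and that exchanging the two legs complex-conjugates $\Delta$, so that the modulus is independent of operator ordering.

```python
import cmath
import math


def bose_einstein(beta, omega):
    """Mean occupation number 1 / (exp(beta * omega) - 1)."""
    return 1.0 / math.expm1(beta * omega)


def thermal_two_point(g1, g2, omegas, beta, t1, t2):
    """Evaluate eq. (29) for time-independent couplings.

    g1[k] and g2[k] are the (complex) couplings of oscillator k to the
    operators inserted at times t1 and t2; omegas[k] is its frequency.
    """
    total = 0.0 + 0.0j
    for a, b, w in zip(g1, g2, omegas):
        coth = 1.0 / math.tanh(beta * w / 2.0)
        # Term proportional to <a a^dagger> = (coth + 1) / 2.
        total += 0.5 * a * b.conjugate() * cmath.exp(-1j * w * (t1 - t2)) * (coth + 1.0)
        # Term proportional to <a^dagger a> = (coth - 1) / 2.
        total += 0.5 * a.conjugate() * b * cmath.exp(1j * w * (t1 - t2)) * (coth - 1.0)
    return total
```

Calling `thermal_two_point(g1, g2, ...)` at times `(t1, t2)` and `thermal_two_point(g2, g1, ...)` at times `(t2, t1)` returns complex-conjugate values, which illustrates the remark that $|\Delta(1,2)| = |\Delta(2,1)|$.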
Just to be concrete, consider the case where the noise is stationary and spatially uncorrelated: each qubit
has a time-independent coupling to its own independent oscillator bath (though admittedly these are dubious
assumptions when multi-qubit gates are executed). Then

$$ \Delta(\alpha_1,x_1,t_1;\alpha_2,x_2,t_2) = \delta_{x_1,x_2}\, \Delta(\alpha_1,x_1,t_1;\alpha_2,x_1,t_2) , \tag{30} $$

where

$$ \Delta(\alpha_1,x_1,t_1;\alpha_2,x_1,t_2) = \int_{-\infty}^{\infty} \frac{d\omega}{2\pi}\, e^{-i\omega(t_1-t_2)}\, \tilde\Delta_{\alpha_1,\alpha_2}(x_1,\omega) \tag{31} $$

and

$$ \tilde\Delta_{\alpha_1,\alpha_2}(x_1,\omega) = \begin{cases} \pi J_{\alpha_1,\alpha_2}(x_1,\omega) \left( \coth(\beta\omega/2) + 1 \right) & \text{if } \omega > 0 , \\ \pi J_{\alpha_1,\alpha_2}^*(x_1,-\omega) \left( \coth(-\beta\omega/2) - 1 \right) & \text{if } \omega < 0 . \end{cases} \tag{32} $$

Here $J_{\alpha_1,\alpha_2}(x_1,\omega)$ is the Hermitian matrix

$$ J_{\alpha_1,\alpha_2}(x_1,\omega) = \sum_k \delta(\omega - \omega_k)\, g_{k,\alpha_1}(x_1)\, g_{k,\alpha_2}^*(x_1) . \tag{33} $$

The function $J_{\alpha_1,\alpha_2}(x_1,\omega)$ is the spin-polarization-dependent power spectrum of the noise acting on qubit
$x_1$. If the energy splitting $\hbar\omega$ of the qubit is tunable, this function can be measured by observing the
qubit's relaxation rate as a function of the energy splitting and the polarization. In principle, multi-qubit
correlations in the noise can also be measured using quantum process tomography.

IV. SOME IMPLICATIONS 

Before presenting our derivation of eq. (27) in Sec. V, we will discuss a few of its implications.

A. Dimensional criterion



Our expression for $\epsilon$ in eq. (27) involves a formal integration over all space and time. If the bath
correlations decay slowly in space or time, this integral might diverge in the limit of a computation that is very
wide, very deep, or both. In that case, our "threshold condition" cannot be satisfied asymptotically, and we
cannot conclude that quantum computation is scalable. On the other hand, if the integral converges "in the
infrared," then the threshold condition has value, as it establishes scalability if the coupling of the system
to the bath is sufficiently weak. As long as $\epsilon$ is finite, we can make it as small as we please by weakening the
coupling of the qubits to the bath, i.e., by rescaling the perturbation $H_{SB}$, or equivalently by rescaling the
field $\phi_\alpha(x,t)$.

What is the criterion for infrared convergence? Let us suppose that the qubits are uniformly distributed
in $D$-dimensional space, and that the bath fluctuations are "critical," i.e., algebraically decaying in space
and in time. We say that the scale dimension of the field is $\delta$ and the dynamical critical exponent is $z$ if,
for large scale factor $\lambda$, the bath two-point function scales according to

$$ \Delta(\lambda x_1, \lambda^z t_1; \lambda x_2, \lambda^z t_2) \sim \lambda^{-2\delta}\, \Delta(x_1,t_1;x_2,t_2) ; \tag{34} $$

thus the time $t$ scales like $z$ powers of the spatial distance $x$. This means that the integral of the two-point
function scales as

$$ \int dt\, d^D x\, |\Delta(x,t;0,0)| \sim R^{D+z-2\delta} , \tag{35} $$

where $R$ is an infrared cutoff. Convergence in the infrared (finiteness of the limit $R \to \infty$) is ensured provided
that

$$ D + z < 2\delta ; \tag{36} $$

if this criterion is satisfied, then scalable quantum computing is achievable at weak coupling. If it is not
satisfied, then scalable quantum computing might still be possible, but our version of the threshold theorem
does not guarantee it. The same criterion was previously stated by Novais et al. [20, 21], though without
rigorous justification.
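
The power counting behind eqs. (35) and (36) can be written down as a two-line check. This is only a sketch: the function names are hypothetical, and the sample values of $D$, $z$, and $\delta$ in the usage note are illustrative, not taken from any particular bath model.

```python
def infrared_exponent(D, z, delta):
    """Exponent of the cutoff R in eq. (35): the integral scales like R**(D + z - 2*delta)."""
    return D + z - 2 * delta


def converges_in_infrared(D, z, delta):
    """Criterion of eq. (36): the R -> infinity limit is finite if D + z < 2*delta."""
    return infrared_exponent(D, z, delta) < 0
```

For instance, `converges_in_infrared(2, 1, 2)` is true (the integral falls off like $R^{-1}$), while `converges_in_infrared(3, 1, 1.5)` is false (the integral grows like $R$). The marginal case $D + z = 2\delta$ is not covered by the strict inequality of eq. (36), so the function reports it as non-convergent.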



B. Almost-Markovian noise 



The noise is Markovian if the bath immediately "forgets" any quantum information it receives, so that the 
information never returns to the system. Though this is never strictly the case, it can be true to an excellent 
approximation if the characteristic correlation time of the bath is very short compared to the time resolution 
with which we monitor the system's behavior. In the Gaussian noise model, the noise is Markovian if the 
bath's two-point correlation function is proportional to a delta function of the time difference:

$$ \Delta(t_1,x_1;t_2,x_2) \propto \delta(t_1 - t_2) . \tag{37} $$



We could say that the noise is "almost Markovian" if the correlation function $\Delta$ is a sharply peaked function
of the time difference, e.g., with width $\tau_c$ much less than the duration $t_0$ of a single quantum gate. In that
case, our expression for the noise strength becomes

$$ \epsilon^2 = C \cdot \max_{\rm Loc} \int_{1,{\rm Loc}} \int_{2,{\rm All}} |\Delta(1,2)| \approx \Gamma t_0 ; \tag{38} $$

for each fixed value of $t_1$, the sharply peaked $t_2$ integral generates the factor $\Gamma$, and then integrating $t_1$ over
the duration of the location generates the factor $t_0$.

We may interpret $\Gamma$ as an error rate per unit time, and $\Gamma t_0$ as an error probability per gate. But note
that the noise strength $\epsilon$ is not this error rate, but rather its square root $(\Gamma t_0)^{1/2}$, in effect the amplitude
of the error. In the Markovian case, fault paths really do decohere, and errors can be assigned probabilities
rather than amplitudes. But our derivation of eq. (27) is too general and insufficiently clever to exploit this
property; hence our threshold condition requires the error amplitude rather than its square to be less than
$\epsilon_0$.

Despite this deficiency, at least our threshold criterion for local noise, when applied to the
almost-Markovian case, improves on the operator norm criterion eq. (11). If the two-point function has a narrow
peak of width $\tau_c$ whose integral is $\Gamma$, then the height of the peak is of order $\Gamma/\tau_c$, and this peak height can
be interpreted as the norm squared of the system-bath Hamiltonian, as in eq. (15). Thus eq. (11) estimates
the noise strength as

$$ \sqrt{\Gamma/\tau_c} \cdot t_0 , \tag{39} $$

which is even more pessimistic than eq. (38). The estimate eq. (39) diverges as the ultraviolet cutoff $\tau_c^{-1}$ is
removed. But the estimate eq. (38) depends on the area under the peak rather than its height, and so has
a smooth limit as $\tau_c \to 0$.
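
The difference between the two estimates is simple to tabulate. The sketch below treats $\Gamma$, $t_0$, and $\tau_c$ as free parameters in arbitrary consistent units (the function names and the sample numbers in the usage note are hypothetical); it evaluates the square root of eq. (38) and the operator-norm estimate of eq. (39).

```python
import math


def epsilon_local(gamma, t0):
    """Noise strength implied by eq. (38): the square root of gamma * t0."""
    return math.sqrt(gamma * t0)


def epsilon_operator_norm(gamma, t0, tau_c):
    """Operator-norm estimate of eq. (39): sqrt(gamma / tau_c) * t0."""
    return math.sqrt(gamma / tau_c) * t0
```

The ratio of the two estimates is $(t_0/\tau_c)^{1/2}$. With the illustrative values `gamma = 1e-6`, `t0 = 1.0`, and `tau_c = 1e-2`, the local-noise estimate is $10^{-3}$ while the operator-norm estimate is ten times larger, and the gap widens without bound as `tau_c` shrinks.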



C. Ohmic noise 



To further explore the sensitivity to high-frequency noise of our estimated noise strength, let us consider
the Ohmic case, as in Sec. II C. If the Fourier transform $\tilde\Delta(\omega)$ of the two-point correlation function is given
by eq. (17), then the real-time correlation function is

$$ \Delta(t) = \int_{-\infty}^{\infty} \frac{d\omega}{2\pi}\, e^{-i\omega t}\, \tilde\Delta(\omega) = \frac{-A}{(t - i\tau_c)^2} . \tag{40} $$

The function $\Delta(t)$ has a short-time singularity at $t = 0$ that is regulated by the cutoff $\tau_c$, but the real and
imaginary parts of $\Delta(t)$ both oscillate, so that its time integral vanishes:

$$ \int_{-\infty}^{\infty} dt\, \Delta(t) = \tilde\Delta(\omega = 0) = 0 . \tag{41} $$

However the estimated noise strength $\epsilon$, which is required to be small by the threshold condition, depends
on the integral of the modulus of $\Delta(t)$,

$$ |\Delta(t)| = \frac{A}{t^2 + \tau_c^2} , \tag{42} $$

which is of course nonnegative and has a nonvanishing time integral; the estimated noise strength is

$$ \left( C \int_{\rm Loc} dt \int_{\rm All} ds\, |\Delta(t-s)| \right)^{1/2} \approx \sqrt{\pi C A} \cdot \left( \frac{t_0}{\tau_c} \right)^{1/2} . \tag{43} $$

This estimate is ultraviolet divergent, but comparing to eq. (18) we see that the divergence has been improved
from linear to square-root dependence on the ultraviolet cutoff $\tau_c^{-1}$.
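
The Ohmic estimate can be checked numerically. The following sketch assumes the correlation function of eq. (40) and approximates the "All" integral by a large but finite window (the function names, the `half_window` parameter, and the step count are illustrative assumptions). The inner integral of eq. (42) is done in closed form with an arctangent, and the outer integral over the location uses the midpoint rule.

```python
import math


def ohmic_correlation(t, A, tau_c):
    """Delta(t) = -A / (t - i*tau_c)**2, as in eq. (40)."""
    return -A / (t - 1j * tau_c) ** 2


def modulus_double_integral(A, tau_c, t0, half_window, steps=2000):
    """Estimate the integral of |Delta(t - s)| for t in [0, t0] and
    s in [-half_window, t0 + half_window], using the midpoint rule in t."""
    dt = t0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        # Closed-form s-integral of A / ((t - s)**2 + tau_c**2) over the window.
        inner = (A / tau_c) * (
            math.atan((t + half_window) / tau_c)
            + math.atan((t0 + half_window - t) / tau_c)
        )
        total += inner * dt
    return total


def epsilon_ohmic(C, A, tau_c, t0):
    """Right-hand side of eq. (43): sqrt(pi * C * A) * (t0 / tau_c)**0.5."""
    return math.sqrt(math.pi * C * A * t0 / tau_c)
```

For a window much wider than $\tau_c$ the double integral approaches $\pi A t_0/\tau_c$, and reducing $\tau_c$ by a factor of four doubles `epsilon_ohmic`, which is the square-root cutoff dependence described above.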

Despite the improvement, the surviving ultraviolet sensitivity in this estimate of $\epsilon$ (for the case of Ohmic
noise) is troubling, as it significantly reduces the class of noise models for which we can conclude that quantum
computing is scalable. Therefore it is important to understand the origin of the ultraviolet divergence. One
might suspect at first that the ultraviolet sensitivity arises because the range of the $dt$ integral in eq. (43)
is a window of width $t_0$ with a sharp boundary. But in fact, for Ohmic noise the sharp boundary generates
only a mild logarithmic ultraviolet divergence, not a power divergence. The actual reason for the power
divergence is that we have pled complete ignorance regarding the frequency spectrum of the ideal system
Hamiltonian $H_S$. Therefore, we are required to be maximally pessimistic about how the oscillating phase of
the wave function arising from the ideal system dynamics matches with the phase of the bath fluctuations.
For that reason our estimate of $\epsilon$ involves an integral of the modulus of $\Delta(t)$ rather than $\Delta(t)$ itself.

With further assumptions about the ideal system dynamics we ought to be able to exclude this highly
pessimistic scenario, leading to an estimate of $\epsilon$ with milder ultraviolet sensitivity. A natural idea is to attempt
a "renormalization group improvement" of the noise model; that is, to "coarse grain" in time, stretching
the short-time cutoff $\tau_c$, while adjusting the bath fluctuations to keep invariant the effect of the noise on
the system. Formally Ohmic noise is "marginal," meaning that the naive renormalization-group scaling
generates only logarithmic cutoff dependence, not the square-root dependence found in eq. (43). However,
rigorously justifying this naive scaling turns out to be technically difficult, in part because $H_{SB}$ couples the
system operators to unbounded bath operators in the Gaussian noise model. It might be interesting to see
if further technical assumptions (which one would hope to justify a posteriori) about the system-bath state
$|\Psi_{SB}(t)\rangle$ during the course of the computation would lead to a less ultraviolet-sensitive threshold condition,
but we have not yet succeeded in finding useful results with this character.



V. DERIVATION



In this section we will derive eq. (27). Our task is to estimate a value of $\epsilon$ such that

$$ \left\| |\Psi_{SB}^{\rm bad}(\mathcal{I}_r)\rangle \right\|^2 = \langle \Psi_{SB}^{\rm bad}(\mathcal{I}_r) | \Psi_{SB}^{\rm bad}(\mathcal{I}_r) \rangle = \langle \Psi_{SB}^0 | U_{SB}^{\rm bad}(\mathcal{I}_r)^\dagger\, U_{SB}^{\rm bad}(\mathcal{I}_r) | \Psi_{SB}^0 \rangle \le \epsilon^{2r} \tag{44} $$

(see eq. (9)). Here $U_{SB}$ is the joint system-bath time evolution operator from the beginning of the
computation until just before the measurements that will read out the final result, and $U_{SB}^{\rm bad}(\mathcal{I}_r)$ denotes the sum
of all the terms in the perturbation expansion of $U_{SB}$ such that the perturbation $H_{SB}$ is inserted at least
once in each of the $r$ specified locations in the set $\mathcal{I}_r$. The initial state of the system and bath is assumed
to be the pure product state $|\Psi_{SB}^0\rangle = |\Psi_S^0\rangle \otimes |\Psi_B^0\rangle$, where the bath's state $|\Psi_B^0\rangle$ is Gaussian; that is, the
expectation values of the bath operators $\{\phi_\alpha(x,t)\}$ obey Gaussian statistics in this state. For now we assume
that system qubits are prepared only at the start of the computation and measured only at the end, with
evolution governed by the Hamiltonian $H = H_S + H_B + H_{SB}$ in between; this assumption can be relaxed,
as we will discuss in Sec. VI.



A. Keldysh diagrams 



The terms in the perturbation expansion can be associated with diagrams, where in each diagram the
perturbation $H_{SB}$ is inserted at a specified set of points in spacetime. We may think of the sum of these
diagrams as representing the expectation in the state $|\Psi_{SB}^0\rangle$ of the product of the forward evolution operator
of the system and bath (i.e., $U_{SB}$), from the initial to the final time, followed by the backward evolution
operator (i.e., $U_{SB}^\dagger$), from the final to the initial time. It is convenient to fold the diagram into a hairpin shape
as in Fig. 2, so that the diagram has two branches that are aligned in time. The upper branch represents
the evolution forward in time; here the inserted perturbations are "time-ordered," meaning that operators
inserted at later times act after operators inserted at earlier times. The lower branch represents the evolution
backward in time; here the inserted perturbations are "anti-time-ordered," meaning that operators inserted
at earlier times act after operators inserted at later times. Furthermore, all operators inserted on the lower
branch act after operators inserted on the upper branch. Diagrams with this structure are sometimes called
"Keldysh diagrams."

FIG. 2: The diagram in (a) represents the norm squared of the system-bath state $U_{SB}|\Psi_{SB}^0\rangle$. By bending it into
a hairpin shape we obtain the "Keldysh diagram" shown in (b). A "marked location" in the circuit appears twice
in the Keldysh diagram, once on the upper branch (as a contribution to $U_{SB}$), and once on the lower branch (as a
contribution to $U_{SB}^\dagger$).

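In each diagram, the evolution of system and bath is governed by the uncoupled Hamiltonian $H_0 =
H_S + H_B$ in between successive insertions of $H_{SB}$. Since $|\Psi_{SB}^0\rangle$ is a product state, the diagram's contribution
to the expectation value factorizes into the product of a system expectation value and a bath expectation
value. Consider a diagram where the operator $\sigma_{\alpha_j} \otimes \phi_{\alpha_j}$ is inserted on the upper branch acting on qubit $x_j$
at time $t_j$, for $j = 1, 2, 3, \ldots, n$, and the operator $\sigma_{\beta_k} \otimes \phi_{\beta_k}$ is inserted on the lower branch acting on qubit
$y_k$ at time $s_k$, for $k = 1, 2, 3, \ldots, m$. Taking into account the uncoupled evolution in between insertions, and
the Keldysh operator ordering rules (where $t_n > t_{n-1} > \cdots > t_1$ and $s_m < s_{m-1} < \cdots < s_1$), this diagram's
contribution to the expectation value $\langle \Psi_{SB}^0 | U_{SB}^\dagger U_{SB} | \Psi_{SB}^0 \rangle$ is

$$ (i)^m (-i)^n\, \langle \Psi_S^0 | \sigma_{\beta_m}(y_m,s_m) \cdots \sigma_{\beta_1}(y_1,s_1)\, \sigma_{\alpha_n}(x_n,t_n) \cdots \sigma_{\alpha_1}(x_1,t_1) | \Psi_S^0 \rangle $$

$$ \qquad \times\, \langle \Psi_B^0 | \phi_{\beta_m}(y_m,s_m) \cdots \phi_{\beta_1}(y_1,s_1)\, \phi_{\alpha_n}(x_n,t_n) \cdots \phi_{\alpha_1}(x_1,t_1) | \Psi_B^0 \rangle . \tag{45} $$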
Here $\sigma_\alpha(x,t) = U_S(t)^\dagger \sigma_\alpha(x) U_S(t)$ and $\phi_\alpha(x,t) = U_B(t)^\dagger \tilde\phi_\alpha(x,t) U_B(t)$ are the "interaction picture"
operators that evolve according to the uncoupled system-bath dynamics. Using the Gaussian statistics (i.e., "Wick's
theorem"), the bath expectation value can be expressed as a sum of products of Keldysh-ordered two-point
correlation functions. Summing $\{\alpha_1, \alpha_2, \ldots, \alpha_n\}$ and $\{\beta_1, \beta_2, \ldots, \beta_m\}$ from 1 to 3, summing $\{x_1, x_2, \ldots, x_n\}$
and $\{y_1, y_2, \ldots, y_m\}$ over all qubits, integrating $\{t_1, t_2, \ldots, t_n\}$ and $\{s_1, s_2, \ldots, s_m\}$ over the interval from the
initial to the final time, and finally summing $n$ and $m$ from 0 to $\infty$, we would recover the full expectation
value $\langle \Psi_{SB}^0 | U_{SB}^\dagger U_{SB} | \Psi_{SB}^0 \rangle = 1$. More precisely, to generate the full system-bath evolution operator $U_{SB}$, for
each fixed $n$ we sum $\{(x_1,t_1), (x_2,t_2), (x_3,t_3), \ldots, (x_n,t_n)\}$ over all time-ordered sets of $n$ spacetime positions
inside the circuit. This is equivalent to integrating each $(x_j,t_j)$ over all spacetime, and then dividing by $n!$
to compensate for the overcounting of the sets (each set has been included $n!$ times). Similarly, to generate
$U_{SB}^\dagger$, for each fixed $m$ we sum $\{(y_1,s_1), (y_2,s_2), (y_3,s_3), \ldots, (y_m,s_m)\}$ over all anti-time-ordered sets of $m$
spacetime positions inside the circuit.

FIG. 3: A Keldysh diagram contributing to $\langle \Psi_{SB}^{\rm bad}(\mathcal{I}_{r=1}) | \Psi_{SB}^{\rm bad}(\mathcal{I}_{r=1}) \rangle$, where black dots are insertions, red lines
are contractions, and the shaded rectangle on each branch indicates the one marked location. In this diagram, the
marked location is "bad" because there is an insertion inside the marked location on each branch. (The upper
branch is labeled by time $t$ and the lower branch by time $s$.)

B. One marked location 

But we do not want to sum all the diagrams; instead we want to sum all and only those such that all of
the $r$ locations in the set $\mathcal{I}_r$ are bad on both the upper and lower branches. Let us first consider the case
$r = 1$, where one particular circuit location has been "marked" as bad. To get a useful bound, it is helpful
to organize this sum in a particular way. Because the marked location is bad, there must be at least one
insertion of the perturbation inside this location on both the upper and lower branch, as in Fig. 3. Therefore,
there must be an earliest insertion inside the marked location on each branch. Also, if the marked location
is a two-qubit gate, then the earliest insertion could act on either one of the two qubits. For now, let us fix
on each branch the time of the earliest insertion inside the marked location, the qubit on which the earliest
insertion acts, and the corresponding Pauli operator. Later on we will integrate the time of the earliest
insertion over the marked location, and also sum over Pauli operators and the qubits at the location, but
not yet.

With the earliest insertion fixed on each branch, and after expanding the bath expectation value in terms
of bath two-point functions, we can identify two classes of diagrams. In "class 1" diagrams, the earliest
insertions on the two branches are contracted with one another, and in "class 2" diagrams they are not, as
shown in Fig. 4. We will find upper bounds on the sum of all the diagrams in each class.

1. Class 1 diagrams 

First consider the class 1 diagrams. Each diagram in the class has as a factor the two-point function
$\Delta(\beta, y, s; \alpha, x, t)$, where $\sigma_\alpha$ acts on qubit $x$ at time $t$ on the upper branch, and $\sigma_\beta$ acts on qubit $y$ at time $s$
on the lower branch. The simplest diagram in the class, which we will call the "skeleton" diagram, has only
two insertions and one contraction; its value is

$$ \langle \Psi_S^0 | \sigma_\beta(y,s)\, \sigma_\alpha(x,t) | \Psi_S^0 \rangle \times \langle \Psi_B^0 | \phi_\beta(y,s)\, \phi_\alpha(x,t) | \Psi_B^0 \rangle . \tag{46} $$

The other class 1 diagrams are obtained by "dressing" this skeleton in all possible ways, by adding further
insertions and contractions.

However, remember that we have fixed t and s to be the times of the earliest insertions of the perturbation 
on the upper and lower branches respectively. Therefore, an additional insertion is "legal" only if it avoids 
times earlier than t inside the marked location on the upper branch, and times earlier than s inside the 
marked location on the lower branch. With this proviso, all class 1 diagrams arise when we dress the 
skeleton class 1 diagram with all possible additional legal insertions and contractions. 

FIG. 4: Skeleton Keldysh diagrams contributing to $\langle \Psi_{SB}^{\rm bad}(\mathcal{I}_{r=1}) | \Psi_{SB}^{\rm bad}(\mathcal{I}_{r=1}) \rangle$, showing the earliest insertion of $H_{SB}$
inside the marked location on both branches. For the class 1 diagram shown in (a), the earliest insertions on the two
branches are contracted with one another, and for the class 2 diagram shown in (b), the earliest insertions are contracted
with other insertions elsewhere. Other diagrams in each class are obtained by dressing the skeleton diagrams with
additional insertions and contractions, except that no insertions are allowed inside the marked locations at times
before the earliest insertion (the regions labeled "excluded" in the figure).



The class 1 diagrams can be summed up and expressed in a compact form. For this purpose we introduce
what we call the "hybrid picture," which is in a sense intermediate between the interaction and Heisenberg
pictures. Let us define the "hybrid Hamiltonian" $H_{SB}^{\rm hyb}$ by

$$ H_{SB}^{\rm hyb} = \begin{cases} 0 & \text{in each marked location prior to the earliest insertion,} \\ H_{SB} & \text{elsewhere.} \end{cases} \tag{47} $$

That is, in the hybrid Hamiltonian, the perturbation $H_{SB}$ "turns off" inside the marked location before time
$t$ on the upper branch and before time $s$ on the lower branch. When we sum up all the legal insertions and
contractions, the value eq. (46) of the skeleton diagram is transformed into

$$ \langle \Psi_{SB}^0 | \sigma_\beta^{\rm hyb}(y,s)\, \sigma_\alpha^{\rm hyb}(x,t) | \Psi_{SB}^0 \rangle \times \langle \Psi_B^0 | \phi_\beta(y,s)\, \phi_\alpha(x,t) | \Psi_B^0 \rangle . \tag{48} $$

Here the interaction picture operator $\sigma_\alpha(x,t)$ has been replaced by the hybrid-picture operator $\sigma_\alpha^{\rm hyb}(x,t) =
U_{SB}^{\rm hyb}(t)^\dagger \sigma_\alpha(x) U_{SB}^{\rm hyb}(t)$, and furthermore the expectation value of the system operator is now evaluated in the
system-bath state $|\Psi_{SB}^0\rangle$ rather than the system state $|\Psi_S^0\rangle$. If the expression eq. (48) is expanded in powers
of $H_{SB}^{\rm hyb}$, all the legal insertions and only the legal insertions are generated. And for each choice of insertions,
evaluating the bath expectation value using Wick's theorem yields a sum over all the contractions included
in class 1. Thus, the assumption that the bath fluctuations are Gaussian is crucial for the derivation of
eq. (48).



Therefore, we obtain an exact expression for the sum of all the diagrams in class 1 from eq. (48) by now
integrating the earliest insertions on both branches over the marked location, finding

$$ \text{Class 1 Diagrams} = \sum_{x,y \in {\rm Loc}} \sum_{\alpha,\beta} \int_{\rm Loc} dt \int_{\rm Loc} ds\, \langle \Psi_{SB}^0 | \sigma_\beta^{\rm hyb}(y,s)\, \sigma_\alpha^{\rm hyb}(x,t) | \Psi_{SB}^0 \rangle \langle \Psi_B^0 | \phi_\beta(y,s)\, \phi_\alpha(x,t) | \Psi_B^0 \rangle . \tag{49} $$

Now, the operator $\sigma_\alpha^{\rm hyb}(x,t)$ differs from the Pauli operator $\sigma_\alpha(x)$ by a mere unitary change of basis, and
therefore has sup norm $\|\sigma_\alpha^{\rm hyb}(x,t)\| = 1$. From eq. (49) we then conclude that

$$ |\text{Class 1 Diagrams}| \le \sum_{x,y \in {\rm Loc}} \int_{\rm Loc} dt \int_{\rm Loc} ds \sum_{\alpha,\beta} \left| \langle \Psi_B^0 | \phi_\beta(y,s)\, \phi_\alpha(x,t) | \Psi_B^0 \rangle \right| = \int_{1,{\rm Loc}} \int_{2,{\rm Loc}} |\Delta(1,2)| , \tag{50} $$

in the notation of eq. (24). This is our bound on the sum of all class 1 diagrams.

Note that in eq. (49) the integrand is the product of a bath two-point correlation function and a
"hybridized" system two-point correlation function. If the bath correlation function has a high-frequency
component and the system correlation function does not, then the contribution to the time integral arising from the
high-frequency bath fluctuations may be strongly suppressed. But the estimate in eq. (50) is very crude (it
applies irrespective of the frequency spectrum of the system correlation function), and we could get a better
estimate if we assumed that the system correlation function has little power at high frequency. Furthermore,
such an assumption seems physically reasonable; the natural frequencies of the system dynamics are set by
the energy splitting of the logical states and by the characteristic time scale (e.g., the gate duration $t_0$) on
which the time-dependent system Hamiltonian varies. Unfortunately, though, finding a rigorous bound on
the high-frequency hybrid system correlation function is not trivial, because the hybrid Hamiltonian includes
the system-bath coupling $H_{SB}$, an unbounded operator. If the bath has a low temperature, then we expect
that high-frequency bath oscillators are likely to be in their ground states, but to prove a threshold theorem,
we need to rule out relatively unlikely events that might foil the computation. That is not so easy to do,
especially if the Hamiltonian is unbounded. So in this paper we will mostly pursue the consequences of the
crude estimate eq. (50) and other similar estimates, leaving for future work the challenge of improving the
results via tighter bounds on the integral in eq. (49). However, we can obtain a stronger bound for the case
of pure dephasing noise, discussed in Sec. VII.



To prevent confusion, we remark that our "hybrid picture" is a rather strange concept, in that the
Hamiltonian that governs evolution on the upper branch of the Keldysh diagram is different than the Hamiltonian
for the lower branch. If we were using Keldysh diagrams the way they are usually used, to track the
evolution of the system's density operator, this feature would be unacceptable because time evolution would not
preserve the density operator's trace. For us, though, the hybrid Hamiltonian is merely a technical trick for
bounding the sum of a class of diagrams, and should not be interpreted literally as the Hamiltonian of a
physical system.



2. Class 2 diagrams 

Now consider the class 2 diagrams. The earliest insertions inside the marked location on the upper and
lower branches of the Keldysh diagram are not contracted with one another; rather each is contracted with
an insertion at another location. Let us say that the earliest insertion at $(x,t)$ in the marked location on
the upper branch is contracted with an insertion at spacetime position $(z,u)$, which could be on either the
upper or lower branch, and that the earliest insertion at $(y,s)$ in the marked location on the lower branch is
contracted with an insertion at $(w,v)$, which also could be on either the upper or lower branch. In principle
$z, w$ could be the spatial labels of any two qubits in the computer, and $u, v$ could be any time between the
initial and final time, except that the insertions at $(z,u)$ and $(w,v)$ must be legal; that is, neither can be
inside the marked location on the upper branch earlier than $t$, or inside the marked location on the lower
branch earlier than $s$.

For the class 2 diagrams, let us for now imagine fixing the insertions at $(z,u)$ and at $(w,v)$ that are
contracted with the earliest insertions; we will integrate over these spacetime positions later on. The simplest
diagram in the class, the "skeleton" diagram with only two contractions, has the value (except for a phase
factor that depends on the choice of branch for the insertions at $(z,u)$ and $(w,v)$)

$$ \langle \Psi_S^0 | T^* \left( \sigma_\beta(y,s)\, \sigma_\delta(w,v)\, \sigma_\gamma(z,u)\, \sigma_\alpha(x,t) \right) | \Psi_S^0 \rangle $$

$$ \qquad \times\, \langle \Psi_B^0 | T^* \left( \phi_\beta(y,s)\, \phi_\delta(w,v) \right) | \Psi_B^0 \rangle\, \langle \Psi_B^0 | T^* \left( \phi_\gamma(z,u)\, \phi_\alpha(x,t) \right) | \Psi_B^0 \rangle , \tag{51} $$

where $T^*$ denotes the proper Keldysh ordering. Other diagrams in class 2 are obtained by dressing this
skeleton with additional insertions and contractions in all possible legal ways. As in our discussion of the
class 1 diagrams, summing all the ways to dress the skeleton transforms the interaction-picture system
operators into hybrid-picture operators, yielding (up to a phase)

$$ \langle \Psi_{SB}^0 | T^* \left( \sigma_\beta^{\rm hyb}(y,s)\, \sigma_\delta^{\rm hyb}(w,v)\, \sigma_\gamma^{\rm hyb}(z,u)\, \sigma_\alpha^{\rm hyb}(x,t) \right) | \Psi_{SB}^0 \rangle $$

$$ \qquad \times\, \langle \Psi_B^0 | T^* \left( \phi_\beta(y,s)\, \phi_\delta(w,v) \right) | \Psi_B^0 \rangle\, \langle \Psi_B^0 | T^* \left( \phi_\gamma(z,u)\, \phi_\alpha(x,t) \right) | \Psi_B^0 \rangle . \tag{52} $$

To obtain the sum of all class 2 diagrams, we now sum over Pauli operator labels and spacetime positions,
obtaining

$$ \text{Class 2 Diagrams} = \sum_{x,y \in {\rm Loc}} \sum_{z,w \in {\rm All}'} \int_{\rm Loc} dt \int_{\rm Loc} ds \int_{{\rm All}'} du \int_{{\rm All}'} dv \sum_{\alpha,\beta,\gamma,\delta} (\text{phase}) $$

$$ \qquad \times\, \langle \Psi_{SB}^0 | T^* \left( \sigma_\beta^{\rm hyb}(y,s)\, \sigma_\delta^{\rm hyb}(w,v)\, \sigma_\gamma^{\rm hyb}(z,u)\, \sigma_\alpha^{\rm hyb}(x,t) \right) | \Psi_{SB}^0 \rangle $$

$$ \qquad \times\, \langle \Psi_B^0 | T^* \left( \phi_\beta(y,s)\, \phi_\delta(w,v) \right) | \Psi_B^0 \rangle\, \langle \Psi_B^0 | T^* \left( \phi_\gamma(z,u)\, \phi_\alpha(x,t) \right) | \Psi_B^0 \rangle . \tag{53} $$

Here the notation All$'$ indicates that the qubit positions $z$ and $w$ are summed over both branches of the
Keldysh diagram, and that the times $u$ and $v$ are also integrated over both branches. Furthermore, it is
understood that the integral over $u$ and $v$ is restricted to legal insertions (times in the upper-branch marked
location earlier than $t$, and in the lower-branch marked location earlier than $s$, are excluded).

As for the class 1 diagrams, we obtain a bound on the sum of class 2 diagrams by noting that the
expectation value of the product of system operators has modulus no larger than one, finding

$$ |\text{Class 2 Diagrams}| \le \left( 2 \int_{1,{\rm Loc}} \int_{2,{\rm All}} |\Delta(1,2)| \right)^2 . \tag{54} $$

To obtain eq. (54) we have noted that the Keldysh ordering is irrelevant when we take the modulus of the
bath two-point function, and that in the sum of the moduli of all diagrams we can extend the integral over
legal insertions to an integral over all insertions to obtain an upper bound. Here the notation All indicates
that the second leg of the correlation function is summed over all qubits and integrated over all times; the
factor of 2 accompanies the integral because the insertions at $(z,u)$ and at $(w,v)$ can be on either one
of the two branches of the Keldysh diagram.

For our upper bound on the sum of class 1 diagrams, both legs of the bath's two-point function are
integrated over the marked location, while in the upper bound on the sum of class 2 diagrams, one leg
is integrated over the marked location, while the other is integrated over all qubits and all times. This
distinction is not so important if the spatial and temporal correlations decay rapidly, but it can be quite
important if the decay is slow, as we have already discussed in Sec. IV A. The upper bound on the sum of
class 1 diagrams is still valid, though weaker, if we extend the integral for one of the legs from the marked
location to all of spacetime. Then by adding together the contributions from diagrams of both classes, we
find

$$ \left\| |\Psi_{SB}^{\rm bad}(\mathcal{I}_{r=1})\rangle \right\|^2 \le E + 4E^2 , \tag{55} $$

where

$$ E = \max_{\rm Loc} \left( \int_{1,{\rm Loc}} \int_{2,{\rm All}} |\Delta(1,2)| \right) . \tag{56} $$

If $E$ is small (the typical case of interest), then the class 1 diagrams dominate, and the contribution from
class 2 diagrams is higher order in $E$. We emphasize again that the integral in the definition of $E$ is
confined to a single branch of the Keldysh diagram, and that the factor of 2 in eq. (54) arises because the
insertion inside the marked location can be contracted with an insertion on either branch.
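
To see how quickly the class 2 contribution becomes negligible, the two terms on the right-hand side of eq. (55) can be evaluated separately. This is a minimal sketch; the function name `one_location_bound` and the sample values of $E$ in the usage note are illustrative assumptions.

```python
def one_location_bound(E):
    """Return (total, class1, class2) for the bound E + 4*E**2 of eq. (55)."""
    class1 = E             # both legs of Delta tied to the marked location, eq. (50)
    class2 = (2 * E) ** 2  # two external contractions, each bounded by 2*E, eq. (54)
    return class1 + class2, class1, class2
```

For example, `one_location_bound(0.01)` gives a class 1 term of 0.01 and a class 2 term of 0.0004. The two terms are equal at $E = 1/4$, so the class 1 diagrams dominate the bound whenever $E < 1/4$.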



C. Many marked locations 

Now we want to consider the case where there are r marked locations. The perturbation $H_{SB}$ must be
inserted at least once in each of the r marked locations, on both the upper and lower branches of the Keldysh
diagram. In order to get an upper bound on the sum of all such diagrams, we will organize the sum following
the same ideas as in our discussion of the r = 1 case. In each marked location on each branch, there must
be an earliest insertion of the perturbation, and this earliest insertion is contracted with another insertion
elsewhere, which could be on either branch.

FIG. 5: Skeleton Keldysh diagrams contributing to $\langle\Psi^{\rm bad}_{SB}(\mathcal{I}_{r=2})|\Psi^{\rm bad}_{SB}(\mathcal{I}_{r=2})\rangle$, showing the earliest insertion of $H_{SB}$
inside each of two marked locations on both branches. There are three skeleton diagrams with two internal
contractions, six skeleton diagrams with one internal contraction, and one skeleton diagram with no internal
contraction, where we say that a contraction is "internal" if it links two earliest insertions.

A skeleton graph contains a "minimal" set of contractions — each contraction in the skeleton has at least 
one leg attached to the earliest insertion in a marked location. We distinguish two types of contractions 
in the skeleton: an "internal" contraction links two earliest insertions, and an "external" contraction links 
an earliest insertion with another legal insertion which is not an earliest insertion. The skeleton diagrams 
can be classified according to the number k of internal contractions. If there are r marked locations, and
therefore all together 2r marked locations between the two branches, then k can vary from 0 to r; if there
are k internal contractions then there are 2(r − k) external contractions. For r = 1, a skeleton diagram in
what we called class 1 has k = 1 internal contractions, and a skeleton diagram in class 2 has k = 0 internal
contractions. For r = 2, the ten distinct skeleton diagrams are shown in Fig. 5. There are three diagrams
with two internal contractions, six diagrams with one internal contraction, and one diagram with no internal
contractions.
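
The counting of skeleton diagrams can be checked with a short script. This is only an illustrative sketch: it assumes that a skeleton with k internal contractions corresponds to a choice of k disjoint pairs among the 2r earliest insertions, and the function name `skeleton_count` is a hypothetical helper, not notation from the text.

```python
from math import factorial


def skeleton_count(r: int, k: int) -> int:
    """Number of ways to choose k disjoint pairs among 2r earliest insertions.

    Assumption: each skeleton diagram with k internal contractions is
    identified with one such choice of pairs (the remaining 2(r - k)
    earliest insertions each carry one external contraction).
    """
    n = 2 * r
    return factorial(n) // (factorial(k) * 2**k * factorial(n - 2 * k))


# r = 2: one diagram with no internal contraction, six with one, three with two.
counts_r2 = [skeleton_count(2, k) for k in range(3)]
print(counts_r2, sum(counts_r2))
```

For r = 1 the same function gives one diagram for k = 0 and one for k = 1, matching the class 2 and class 1 skeletons.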

The value of a skeleton diagram, with all insertions and contractions fixed, can be expressed as a product
of the expectation value of a string of Keldysh-ordered interaction-picture system operators

$$\langle\Psi^0_S|\,T^*\left(\sigma(i_1)\cdots\sigma(i_k)\,\sigma(j_1)\cdots\sigma(j_k)\,\sigma(m_1)\cdots\sigma(m_{2(r-k)})\,\sigma(n_1)\cdots\sigma(n_{2(r-k)})\right)|\Psi^0_S\rangle \tag{57}$$

and a product of Keldysh-ordered bath two-point functions

$$(\text{phase})\times\langle\Psi^0_B|T^*\left(\phi(i_1)\phi(j_1)\right)|\Psi^0_B\rangle\cdots\langle\Psi^0_B|T^*\left(\phi(i_k)\phi(j_k)\right)|\Psi^0_B\rangle \times \langle\Psi^0_B|T^*\left(\phi(m_1)\phi(n_1)\right)|\Psi^0_B\rangle\cdots\langle\Psi^0_B|T^*\left(\phi(m_{2(r-k)})\phi(n_{2(r-k)})\right)|\Psi^0_B\rangle . \tag{58}$$

Here we have attached labels 1, 2, 3, . . . , 2r to the 2r marked locations on the two branches, and e.g., $\sigma(i_1)$
is shorthand for $\sigma_{\alpha_{i_1}}(x_{i_1},t_{i_1})$, where $(x_{i_1},t_{i_1})$ is the spacetime position of the first insertion inside marked
location number $i_1$. In eqs. (57) and (58), locations $i_1$ through $i_k$ are internally contracted with locations $j_1$
through $j_k$, while the remaining earliest insertions in locations $m_1$ to $m_{2(r-k)}$ are contracted with insertions
labeled $n_1$ through $n_{2(r-k)}$ which are not earliest insertions.



When we sum up all ways to dress the skeleton with additional insertions and contractions, we obtain an
expression of the same form, but with the interaction-picture system operators replaced by hybrid-picture
system operators, and $|\Psi^0_S\rangle$ replaced by $|\Psi^0_{SB}\rangle$. Bounding the system operator expectation value by one,
and summing over the Pauli operator labels, we obtain a bound on the sum of all dressed skeleton diagrams.

$$\left|\sum \text{Dressed Skeletons}\right| \le \prod_{a=1}^{k}|\Delta(i_a,j_a)|\prod_{b=1}^{2(r-k)}|\Delta(m_b,n_b)| . \tag{59}$$



Now, keeping fixed the choice of which locations are internally contracted with one another, we can integrate
each $i_a$, $j_a$, and $m_b$ over the specified marked location, while integrating $n_b$ over all locations on both
branches. The integral is bounded above by

$$\int\left|\sum \text{Dressed Skeletons}\right| \le \prod_{a=1}^{k}G(i_a,j_a)\prod_{b=1}^{2(r-k)}2E(m_b) , \tag{60}$$

where

$$G(i_a,j_a) = \int_{1,\mathrm{Loc}(i_a)}\int_{2,\mathrm{Loc}(j_a)}|\Delta(1,2)| , \qquad E(m_b) = \int_{1,\mathrm{Loc}(m_b)}\int_{2,\mathrm{All}}|\Delta(1,2)| , \tag{61}$$

and where the factor of 2 multiplying $E(m_b)$ results from summing $n_b$ over both branches.



Now, with the number k of internal contractions still fixed, we can sum eq. (60) over all the ways that k
contracted pairs of locations can be chosen from among 2r locations. We note that

$$\sum_{\text{contractions}(k)}\left(\prod_{a=1}^{k}G(i_a,j_a)\right) \le \frac{1}{k!}\left(\sum_{\substack{i,j=1\\ i<j}}^{2r}G(i,j)\right)^{k} ; \tag{62}$$

this inequality holds because $\left(\sum_{i<j}G(i,j)\right)^k$ contains the term corresponding to each contraction k! times,
and also contains other nonnegative terms. Furthermore,

$$\sum_{\substack{i,j=1\\ i<j}}^{2r}G(i,j) \le \sum_{i=1}^{2r}E(i) , \tag{63}$$

because the expression on the right-hand side contains all the terms on the left-hand side, plus other
nonnegative terms. We conclude that

$$\sum_{\text{contractions}(k)}\int\left|\sum \text{Dressed Skeletons}\right| \le \frac{1}{k!}(2rE)^{k}(2E)^{2(r-k)} , \tag{64}$$

with E defined as in eq. (56).



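It remains to sum over k, the number of internal contractions;

$$\left|\sum \text{Diagrams}\right| \le \sum_{k=0}^{r}\left(\sum_{\text{contractions}(k)}\int\left|\sum \text{Dressed Skeletons}\right|\right) \le \sum_{k=0}^{r}\frac{r^k}{k!}(2E)^{2r-k} . \tag{65}$$

Therefore, if we assume that $2E \le 1$,

$$\left\| |\Psi^{\rm bad}_{SB}(\mathcal{I}_r)\rangle \right\|^2 \le (2E)^r\sum_{k=0}^{r}\frac{r^k}{k!}(2E)^{r-k} \le (2E)^r\sum_{k=0}^{\infty}\frac{r^k}{k!} = (2eE)^r = \varepsilon^{2r} , \tag{66}$$

where

$$\varepsilon = \sqrt{2eE} \approx 2.33\sqrt{E} . \tag{67}$$

Thus we have derived eq. (27). We note that eq. (66) also applies for r = 1, and in that case is weaker than
the upper bound $\left\| |\Psi^{\rm bad}_{SB}(\mathcal{I}_{r=1})\rangle \right\|^2 \le E + 4E^2 = E(1+4E) \le 3E$ found in eq. (55), assuming $E \le 1/2$.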
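
As a numerical sanity check on the last step, the sketch below compares the finite sum $\sum_{k=0}^{r} \frac{r^k}{k!}(2E)^{2r-k}$ with the closed form $(2eE)^r$ for a few values of E with $2E \le 1$. The function names and the sample values of E and r are illustrative choices, not taken from the text.

```python
from math import e, factorial


def diagram_sum_bound(E: float, r: int) -> float:
    """Finite sum over the number k of internal contractions."""
    return sum(r**k / factorial(k) * (2 * E) ** (2 * r - k) for k in range(r + 1))


def closed_form_bound(E: float, r: int) -> float:
    """Closed-form bound (2 e E)^r, valid when 2E <= 1."""
    return (2 * e * E) ** r


for E in (0.01, 0.1, 0.5):
    for r in (1, 2, 5, 10):
        # The finite sum never exceeds the closed form when 2E <= 1.
        assert diagram_sum_bound(E, r) <= closed_form_bound(E, r)

# For r = 1 the single-location bound E + 4E^2 is tighter than 2eE.
E = 0.1
print(E + 4 * E**2, closed_form_bound(E, 1))
```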



VI. GENERALIZATIONS 

A. Initial state of the bath 

In our analysis, we have found it convenient to assume that the initial state of the bath is a pure state,
but the analysis also applies if the bath starts out in a mixed state. Actually, we can "purify" a mixed state
of the bath by introducing a fictitious reference system R, and choosing the pure state $|\Psi^0_{BR}\rangle$ of BR so
that

$$\rho^0_B = \mathrm{tr}_R\left(|\Psi^0_{BR}\rangle\langle\Psi^0_{BR}|\right) . \tag{68}$$

Our previous analysis then applies, if we consider BR to be an "extended" bath, such that the system
interacts with only the subsystem B of the extended bath.

However, the state is not arbitrary; for our argument to apply it must be chosen so that the
interaction-picture free field $\phi_\alpha(x,t)$ has Gaussian statistics in this state. For this it suffices for $|\Psi^0_{BR}\rangle$ to
be an undisplaced Gaussian squeezed state. If we consider the reference system R, like the bath B, to be
a system of uncoupled oscillators, then an undisplaced Gaussian squeezed state is obtained by applying a
unitary transformation V to the oscillator ground state $|0_B,0_R\rangle$, where the action of V on the annihilation
operators is homogeneous and linear:

$$V^{-1}a_kV = \sum_j\left(M_{kj}a_j + N_{kj}a_j^\dagger\right) ; \tag{69}$$

here the set $\{a_k\}$ includes annihilation operators for both the B oscillators and the R oscillators, and the
matrices M and N obey constraints that ensure preservation of the commutation relations. V satisfies
eq. (69) if its logarithm is strictly quadratic in creation and annihilation operators, with no linear term. A
special case is the thermal state of the bath, whose purification can be written as

$$\exp\left[\sum_k r_k\left(a^\dagger_{B,k}a^\dagger_{R,k} - a_{B,k}a_{R,k}\right)\right]|0_B,0_R\rangle = \bigotimes_k\left(\sqrt{1-\gamma_k^2}\sum_{n_k=0}^{\infty}\gamma_k^{n_k}\,|n_k\rangle_B\otimes|n_k\rangle_R\right) , \tag{70}$$

where $\gamma_k = \tanh r_k = e^{-\beta\omega_k/2}$ and $\beta$ is the inverse temperature. But our arguments apply to any Gaussian
state $V|0_B,0_R\rangle$, since the action of V in eq. (69) maps free fields to new free fields that still satisfy Wick's
theorem and have mean zero.
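
The thermal purification can be sanity-checked for a single oscillator mode. The sketch below assumes the squeezing parameter satisfies $\gamma = e^{-\beta\omega/2}$ (with $\hbar = 1$ and $\omega$ the mode frequency, which is our reading of the garbled exponent), so that the reduced state of B has Boltzmann weights $(1-\gamma^2)\gamma^{2n}$; it then compares the mean occupation number with the Bose-Einstein value. Function names and the truncation `n_max` are illustrative.

```python
from math import exp, isclose


def mean_occupation_from_purification(beta: float, omega: float, n_max: int = 2000) -> float:
    """Mean occupation of mode B after tracing out R from the two-mode squeezed state.

    Assumption: gamma = exp(-beta * omega / 2), so the reduced state has
    weights (1 - gamma^2) * gamma^(2n) on the number states |n>.
    """
    gamma = exp(-beta * omega / 2)
    norm = 1 - gamma**2
    return sum(n * norm * gamma ** (2 * n) for n in range(n_max))


def bose_einstein(beta: float, omega: float) -> float:
    """Thermal mean occupation 1 / (exp(beta * omega) - 1)."""
    return 1.0 / (exp(beta * omega) - 1.0)


for beta_omega in (0.5, 1.0, 3.0):
    assert isclose(
        mean_occupation_from_purification(beta_omega, 1.0),
        bose_einstein(beta_omega, 1.0),
        rel_tol=1e-9,
    )
```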



B. Measurement and entropy removal 

For fault-tolerant computing to work, there must be a mechanism for flushing the entropy introduced 
by noise. In the scheme we have analyzed, entropy is removed from the computer because error-correction 
gadgets use a supply of fresh ancilla qubits that are discarded after use. It has been understood that the 
initial state $|\Psi^0_S\rangle$ of the system includes all of the ancilla qubits that will be needed during the full course
of the computation. But to model the actual situation, in which ancilla qubits are prepared as needed just 
before being used, we imagine that ancilla qubits are perfectly isolated from the bath until "opened" at the 
onset of the gadget in which they participate. Similarly, we imagine that the measurements of all ancilla 
qubits are delayed until the very end of the computation, but that these qubits are "closed" (their coupling 
to the bath is turned off) at the conclusion of the gadget in which they participate. With these stipulations, 
our noise model is equivalent to one in which ancilla qubits are repeatedly measured, reset, and reused. 

We model the noisy preparation of an ancilla qubit as an ideal preparation followed by interaction with 
the bath for a specified duration. Since the state of the bath may evolve during the computation, the noise 
in the preparation may also depend on when the qubit is prepared. Still, we are taking it for granted that 
"pretty good" fresh ancillas can be prepared at any time, or equivalently that qubits can be effectively erased 
at any time. Implicitly, we have adopted a "two-reservoir" hypothesis. One reservoir, which we have called 
the "bath," interacts with the system qubits, causing noise. The other reservoir is the entropy "sink," which 
carries away heat each time a qubit is erased. In our model, the bath and the sink are uncoupled, and the 
sink has infinite heat capacity — it never heats up no matter how many qubit erasures occur. 

Because the bath interacts with the system, in principle it might be driven far from its initial state in a 
manner that depends on the ideal computation being simulated. Our arguments have shown that, at least 
if the bath is a system of uncoupled oscillators and its initial state is Gaussian, the bath will not be pushed 
to a highly adversarial state that overpowers our efforts to make the computation robust. One wonders 
how that conclusion could be altered if we relax the two-reservoir hypothesis by coupling the sink and the 
bath, or by eliminating the sink entirely. For example, we could attempt to model measurement and erasure
more realistically by including entropy flow from the system to the bath. In that case, a bath of unbounded 
heat capacity would be needed to remove entropy from a noisy computation of unbounded size, and our 
modeling would need to incorporate a mechanism for equilibration of the bath. The goal would be to specify 
conditions under which the entropy flow from system to bath can be maintained well enough to support 
scalable quantum computation. For now, we put aside this ambitious project as an open problem for future 
consideration. 

C. Postselection 

Some fault-tolerant gadgets include postselection. For example, a gadget might consume a disposable 
piece of "quantum software," an encoded ancilla state that is prepared offline and verified before coming 
into contact with the encoded data processed by the computation. The verification procedure includes 
measurements that check the accuracy of the preparation of the software, and the software is accepted 
only if the measurements have suitable outcomes; otherwise the software is rejected and the preparation is 
repeated. Therefore, estimates of the reliability of gadgets are conditioned on acceptance of the software, 
which is said to have a "postselected" state. 

Some fault-tolerant protocols, such as those analyzed in [10, 11, 12], make "extreme" use of postselection,
meaning that the software is usually rejected and the preparation is typically repeated many times before 
it finally succeeds. For such protocols, noise with adversarial correlations can be a formidable foe, since the 
adversary is empowered to enhance the probability of acceptance for atypical fault paths that are especially 
damaging. Thus, the threshold estimates based on extreme postselection proved in [10] apply for independent
noise but not for local stochastic noise. But other protocols, such as those analyzed in [51 151
make only "modest" use of postselection, meaning that software is accepted with reasonably high probability. 
For such protocols, a gadget's failure rate, conditioned on acceptance of the software, can be easily estimated 
using the Bayes rule, even for the case of local stochastic noise. 

The threshold estimate for local noise whose proof is sketched in Appendix A applies to a protocol with 
no postselection at all. For local noise, as for local stochastic noise, we do not know how to extend this proof 
to a protocol with extreme postselection. But it can be extended to a protocol with modest postselection. 
This observation is useful, because threshold estimates based on protocols with modest postselection are 
typically higher than estimates based on protocols without postselection. 

Before considering the case of local coherent noise, we recall how protocols with modest postselection can 
be analyzed for the case of local stochastic noise [B^; to be concrete, we will discuss the case where the level-1 
gadgets are based on a quantum error-correcting code that corrects one error in a block. A properly designed 
gadget processes encoded data correctly if the software is accepted and the gadget contains no more than 
one fault. Therefore, the joint probability $P_{\rm joint}$ of acceptance of the software and failure of the gadget is
bounded above by $B\varepsilon^2 + D\varepsilon^3$ for local stochastic noise with strength $\varepsilon$, where B is the number of malignant
pairs of locations in the gadget where faults can cause failure (assuming the software is accepted), and D
is the total number of sets of three locations in the gadget. On the other hand, the software will surely be
accepted if there are no faults in the software preparation circuit, so the probability of acceptance $P_{\rm accept}$ is
bounded below by $1 - C\varepsilon$, where C is the total number of locations in the circuit for software preparation
and verification. Using the Bayes rule, we obtain an upper bound on the probability $P_{\rm conditional}$ of failure
conditioned on acceptance:

$$P_{\rm conditional} = \frac{P_{\rm joint}}{P_{\rm accept}} \le \frac{B\varepsilon^2 + D\varepsilon^3}{1 - C\varepsilon} \le \varepsilon^2/\varepsilon_0 \equiv \varepsilon^{(1)} , \tag{71}$$

where

$$\varepsilon_0^{-1} = \frac{1}{2}(B + C)\left(1 + \sqrt{1 + 4D/(B + C)^2}\right) \tag{72}$$

is determined by solving the equation $(B\varepsilon_0^2 + D\varepsilon_0^3)/(1 - C\varepsilon_0) = \varepsilon_0$. This argument gives a useful result if our
lower bound on $P_{\rm accept}$ is not too small. In practice, it is often the case that $C \ll B$ and therefore $C\varepsilon_0 \ll 1$,
so that the "postselection correction" arising from division by $P_{\rm accept}$ is a small effect.
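
To make the threshold formula concrete, the sketch below evaluates $\varepsilon_0$ from the closed form $\varepsilon_0^{-1} = \frac{1}{2}(B+C)\bigl(1+\sqrt{1+4D/(B+C)^2}\bigr)$ and checks that it solves $(B\varepsilon_0^2 + D\varepsilon_0^3)/(1 - C\varepsilon_0) = \varepsilon_0$. The gadget counts B, C, D used here are made-up illustrative numbers, not values from any protocol discussed in the text, and the function names are hypothetical.

```python
from math import isclose, sqrt


def threshold_estimate(B: float, C: float, D: float) -> float:
    """Closed-form solution for eps_0 in terms of the gadget counts B, C, D."""
    s = B + C
    return 1.0 / (0.5 * s * (1.0 + sqrt(1.0 + 4.0 * D / s**2)))


def conditional_failure_bound(eps: float, B: float, C: float, D: float) -> float:
    """Upper bound (B eps^2 + D eps^3) / (1 - C eps) on the conditional failure probability."""
    return (B * eps**2 + D * eps**3) / (1.0 - C * eps)


# Hypothetical counts, chosen only to exercise the formulas.
B, C, D = 100.0, 10.0, 1000.0
eps0 = threshold_estimate(B, C, D)

# eps0 is a fixed point of the bound, and below eps0 the bound is at most eps^2 / eps0.
assert isclose(conditional_failure_bound(eps0, B, C, D), eps0, rel_tol=1e-9)
eps = eps0 / 2
assert conditional_failure_bound(eps, B, C, D) <= eps**2 / eps0
print(eps0, C * eps0)
```

The printed value of `C * eps0` gives a feel for the size of the postselection correction in this made-up example.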
There is another way to describe this estimate that is more readily generalized to the case of local
coherent noise, and also clarifies why the estimate applies to adversarial local stochastic noise. Imagine that
n software preparation and verification attempts are executed in parallel, where we label the attempts by an
index i = 1, 2, 3, . . . , n, and suppose for the moment that the noise is uncorrelated. Now we distinguish n + 1
possible ways for the gadget to be bad, depending on which preparation attempt (if any) is the first to be
accepted. If ancilla 1 is accepted, then the gadget fails with probability $P_{\rm joint}$. But ancilla 1 is rejected with
probability $P_{\rm reject} = 1 - P_{\rm accept}$, so the probability that ancilla 1 is rejected, ancilla 2 is accepted, and the
gadget fails is $P_{\rm reject}P_{\rm joint}$. Similarly, the probability that ancilla m is the first to be accepted and the gadget
fails is $P_{\rm reject}^{m-1}P_{\rm joint}$, and the probability that all n ancillas are rejected is $P_{\rm reject}^{n}$. Summing the probability of
all failure scenarios, we find

$$P_{\rm fail} = P_{\rm joint}\left(\sum_{m=1}^{n}P_{\rm reject}^{m-1}\right) + P_{\rm reject}^{n} = \frac{P_{\rm joint}}{P_{\rm accept}}\left(1 - P_{\rm reject}^{n}\right) + P_{\rm reject}^{n} = \frac{P_{\rm joint}}{P_{\rm accept}} + P_{\rm reject}^{n}\left(1 - \frac{P_{\rm joint}}{P_{\rm accept}}\right) . \tag{73}$$
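
The sum over failure scenarios can be checked against its closed form numerically. In this sketch the scenario sum is $P_{\rm joint}\sum_{m=1}^{n} P_{\rm reject}^{m-1} + P_{\rm reject}^{n}$ with $P_{\rm reject} = 1 - P_{\rm accept}$; the sample probabilities are arbitrary illustrative values and the function names are hypothetical.

```python
from math import isclose


def p_fail_sum(p_joint: float, p_accept: float, n: int) -> float:
    """Sum over the n + 1 failure scenarios for n parallel preparation attempts."""
    p_reject = 1.0 - p_accept
    return p_joint * sum(p_reject ** (m - 1) for m in range(1, n + 1)) + p_reject**n


def p_fail_closed(p_joint: float, p_accept: float, n: int) -> float:
    """Closed form: P_joint/P_accept + P_reject^n * (1 - P_joint/P_accept)."""
    p_reject = 1.0 - p_accept
    ratio = p_joint / p_accept
    return ratio + p_reject**n * (1.0 - ratio)


# Arbitrary illustrative values: joint failure 1e-4, acceptance probability 0.9.
for n in (1, 2, 5, 50):
    assert isclose(p_fail_sum(1e-4, 0.9, n), p_fail_closed(1e-4, 0.9, n), rel_tol=1e-12)

# For large n the result approaches the conditional probability P_joint / P_accept.
print(p_fail_closed(1e-4, 0.9, 200), 1e-4 / 0.9)
```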




In the limit $n \to \infty$, we recover the estimate eq. (71), and even for n = 2 we have $P_{\rm fail} = O(\varepsilon^2)$. Furthermore,
the upper bound on $P_{\rm fail}$ obtained from upper bounds on $P_{\rm joint}$ and $P_{\rm reject}$ applies not only to independent
noise but also to correlated local stochastic noise; it can be regarded as an estimate of the effective noise
strength $\varepsilon^{(1)}$ after one level reduction step. For local stochastic noise, we sum over all failure scenarios at
each of r marked locations, and conclude that the probability that all r locations are bad is bounded above
by $\left(\varepsilon^{(1)}\right)^r$.

We can also apply this strategy of summing over all failure scenarios in the case of local coherent noise.
First we note that, to preserve the framework assumed in Sec. V, we may imagine that all measurements
in verification steps are postponed until the end of the computation. In the actual circuit, the "verification
qubits" are measured inside gadgets, and then subsequent operations are conditioned on the classical
measurement outcomes. To model this circuit in the framework where all measurements are postponed, we
suppose that a verification qubit decouples from the bath at the time when it is measured in the actual circuit,
and we replace operations conditioned on measurement outcomes by noiseless quantum gates conditioned
on the state of the verification qubit, after decoupling from the bath but prior to being measured. Then we
can estimate $\left\| |\Psi^{\rm bad}_{SB}(\mathcal{I}^{(1)}_r)\rangle \right\|^2$ by summing over n + 1 failure scenarios at each of the r marked locations. In
scenario 1, ancilla 1 is accepted and the gadget using ancilla 1 (including the preparation and verification
of ancilla 1) has two or more faults. In scenario m, for m = 2, 3, . . . , n, the first m − 1 ancillas are rejected,
ancilla m is accepted, and the gadget using ancilla m has two or more faults. In scenario n + 1, all n ancillas
are rejected. Since the scenarios are perfectly distinguishable, they should be summed incoherently.

Now, in order for an ancilla to be rejected, there must be at least one fault in the circuit that prepares and
verifies that ancilla. Therefore, in scenario m, we sum coherently over all fault paths such that there is at
least one fault in each of the first m − 1 ancilla preparation/verification circuits, and at least two faults in the
gadget using ancilla m. This sum includes all of the fault paths that contribute to the badness of the gadget
under scenario m, but it also includes other fault paths that do not contribute to scenario m. However, since
the scenarios are distinguishable, there is no harm in including these additional unwanted scenarios, if our
goal is to obtain an upper bound on $\left\| |\Psi^{\rm bad}_{SB}(\mathcal{I}^{(1)}_r)\rangle \right\|$. This coherent sum for each scenario can be estimated
by the method described in Appendix A. One finds that, for gadgets such that the "postselection correction"
to $\varepsilon^{(1)}$ is small in the case of local stochastic noise, the correction is small for local coherent noise as well.

In [23], the lower bound on the accuracy threshold $\varepsilon_0 > 1.94 \times 10^{-4}$ was established for local stochastic
noise, based on a protocol with modest use of postselection. Though we have not done the calculation in
detail, we expect that a similar estimate $\varepsilon_0 \sim 10^{-4}$, based on the same protocol, also applies to the threshold
noise strength for local coherent noise. (The argument in [9] achieves a higher threshold estimate for local
stochastic noise, but uses a different method that is less easily adapted to the case of local coherent noise.)

Of course, for the case of Gaussian noise, if gadgets include multiple parallel attempts to prepare and
verify software, then all of these attempts should be included in the integral $\int_{1,\mathrm{Loc}}$ in our estimate of the
noise strength in eq. (27).

D. Other considerations

It would be desirable to extend the derivation of our threshold result in several other directions. One 
possible approach is to allow the bath fluctuations to be weakly non-Gaussian by including small anharmonic 
corrections in the bath Hamiltonian $H_B$. But, though the effects of bath self-interactions can be analyzed
perturbatively by standard methods, obtaining useful rigorous results summed to all orders of perturbation
theory is not simple. Another worthy goal, already emphasized at the end of Sec. IV, is to formulate a
threshold condition less sensitive to the high-frequency fluctuations of the bath; i.e., to noise with a frequency
large compared to the natural frequencies of the ideal system Hamiltonian $H_S$. In principle this might be
done by "integrating out" high-frequency noise, obtaining an effective noise model with a lower frequency 
cutoff that faithfully reproduces the impact of the noise on the simulated computation. Making such an 
analysis rigorous is another challenging open problem. In the next Section, though, we will discuss one 
special case in which an improved threshold estimate less sensitive to high-frequency noise can be achieved. 

VII. DIAGONAL GAUSSIAN NOISE 

As we discussed in Sec. V B 1, our general arguments do not place any constraints on the frequency
spectrum of the "hybrid-picture" system operators. Therefore, we were forced to take the modulus of the
bath two-point function in our estimate of the noise strength $\varepsilon$. And as a result, our estimate has a sensitivity
to high-frequency bath fluctuations that seems rather artificial.

There is at least one case where we have much better analytic control over the time-dependence of the 
system operators, allowing us to obtain a better estimate of the noise strength that has milder sensitivity to 
high-frequency noise. That is the case of pure dephasing noise, which we will discuss now. 

In this noise model, the bath couples only to the z-components of the qubits, so that the system-bath 
Hamiltonian is 

$$H_{SB} = \sum_x \sigma_z(x)\otimes\phi(x,t) , \tag{74}$$

where $\phi(x,t)$ is a Gaussian bath variable with mean zero. To further simplify the discussion (whose purpose
is merely illustrative anyway), we will also assume there are no multi-qubit correlations in the noise (even
though this might not be an accurate description of the noise in multi-qubit gates). That is, we assume
$\langle\phi(x,t)\phi(y,s)\rangle = 0$ for $x \ne y$, so that in effect each qubit is coupled to its own independent bath.

A scheme for fault-tolerant quantum computation customized for highly-biased noise dominated by de- 
phasing was formulated in [53] and further discussed in In this scheme, all gates are teleported. 
Furthermore the only fundamental operations used are single-qubit preparations, single-qubit measurements
in the $\sigma_x$-eigenstate basis, and two-qubit controlled-phase (CPHASE) gates. A CPHASE gate is diagonal in
the computational (i.e., $\sigma_z$-eigenstate) basis, with eigenvalues (1,1,1,-1). Thus it can be realized by a
time-dependent two-qubit system Hamiltonian that is also diagonal:

$$H_S = f(t)\left(\sigma_z\otimes\sigma_z - \sigma_z\otimes I - I\otimes\sigma_z\right) , \tag{75}$$

where $\int dt\,f(t) = \pi/4$. This diagonal system Hamiltonian commutes with the system-bath Hamiltonian
eq. (74), whose action on the system qubits is diagonal. As in previous Sections, we model a noisy qubit
preparation as an ideal preparation followed by interaction with the oscillator bath, and we model a noisy
measurement as interaction with the bath followed by an ideal measurement.
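
A quick check that this diagonal Hamiltonian produces a CPHASE gate: the sketch below exponentiates $\sigma_z\otimes\sigma_z - \sigma_z\otimes I - I\otimes\sigma_z$ with total pulse area $\pi/4$ on the four computational basis states and removes the global phase. Because everything is diagonal, only the eigenvalues are needed; the function name and the dictionary `Z` are illustrative helpers.

```python
import cmath
from math import pi

# Eigenvalue of sigma_z on the computational basis states |0> and |1>.
Z = {0: 1, 1: -1}


def cphase_diagonal(theta: float = pi / 4) -> list:
    """Diagonal of exp(-i * theta * (ZZ - ZI - IZ)), with the global phase removed."""
    phases = []
    for a in (0, 1):
        for b in (0, 1):
            h = Z[a] * Z[b] - Z[a] - Z[b]  # eigenvalue of the bracket in eq. (75)
            phases.append(cmath.exp(-1j * theta * h))
    return [p / phases[0] for p in phases]


diag = cphase_diagonal()
expected = [1, 1, 1, -1]
assert all(abs(d - x) < 1e-12 for d, x in zip(diag, expected))
```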

We can analyze the effect of the noise on the computation using interaction-picture perturbation theory, 
and in fact we can estimate a probability (rather than an amplitude) for the outcome of a qubit measurement 
to differ from the measurement outcome in the ideal computation. For each qubit, we distinguish between 
good diagrams, in which the perturbation $H_{SB}$ is inserted an even number of times in between the (ideal)
qubit preparation and the (ideal) qubit measurement, and the bad diagrams, in which the perturbation is
inserted an odd number of times in between the preparation and the measurement. Because $\sigma_z$ commutes
with the ideal system Hamiltonian $H_S$, and because $\sigma_z^2 = I$, in all good diagrams the outcome of the final
$\sigma_x$ measurement agrees with the result in the ideal quantum circuit, while in bad diagrams the measurement
outcome is flipped.

FIG. 6: The three connected Keldysh diagrams for a single qubit subject to Gaussian dephasing noise. The first two
diagrams are "good" because $\sigma_z$ is inserted an even number of times on each branch, and the third diagram is "bad"
because $\sigma_z$ is inserted an odd number of times on each branch.

Furthermore, the good and the bad part of the system-bath state $|\Psi_{SB}\rangle$ are mutually orthogonal. To see
this, imagine evaluating the inner product $\langle\Psi^{\rm good}_{SB}|\Psi^{\rm bad}_{SB}\rangle$ between the good and bad parts of the state for a
single qubit and its associated bath. Since $|\Psi^{\rm bad}_{SB}\rangle$ has an odd number of perturbation insertions and $|\Psi^{\rm good}_{SB}\rangle$
has an even number, each Keldysh diagram contributing to $\langle\Psi^{\rm good}_{SB}|\Psi^{\rm bad}_{SB}\rangle$ is proportional to the expectation
value of a product of an odd number of interaction-picture bath fields. All such diagrams vanish, since $\phi(x,t)$
is Gaussian with mean zero. Because the good and bad parts of $|\Psi_{SB}\rangle$ are perfectly distinguishable, we can
regard $\langle\Psi^{\rm bad}_{SB}|\Psi^{\rm bad}_{SB}\rangle$ as the probability of error in the final qubit measurement.

Let us compute this probability. The sum of all Keldysh diagrams (both good and bad) contributing
to $\langle\Psi_{SB}|\Psi_{SB}\rangle$ for a single qubit is the exponential of the sum of "connected" diagrams. There are three
connected diagrams, shown in Fig. 6. Thus



$$1 = \langle\Psi_{SB}|\Psi_{SB}\rangle = \exp\left(C_U + C_L + D\right) ; \tag{76}$$

here,

$$C_U = -\int_{t>s} dt\,ds\,\langle\phi(t)\phi(s)\rangle \tag{77}$$

is the connected diagram in which two insertions on the upper Keldysh branch are contracted,

$$C_L = -\int_{t<s} dt\,ds\,\langle\phi(t)\phi(s)\rangle \tag{78}$$

is the connected diagram in which two insertions on the lower branch are contracted, and

$$D = \int dt\,ds\,\langle\phi(t)\phi(s)\rangle = -(C_U + C_L) \tag{79}$$



is the connected diagram in which an insertion on the upper branch is contracted with an insertion on the
lower branch. (In all three diagrams, the factor due to the expectation value of the product of system
operators is simply 1.) When eq. (76) is expanded in powers of D, terms with an odd number of powers
contribute to $\langle\Psi^{\rm bad}_{SB}|\Psi^{\rm bad}_{SB}\rangle$, because $H_{SB}$ is inserted an odd number of times on each branch, and terms with
an even number of powers contribute to $\langle\Psi^{\rm good}_{SB}|\Psi^{\rm good}_{SB}\rangle$, because $H_{SB}$ is inserted an even number of times on
each branch. Thus,

$$P^{\rm bad} = \langle\Psi^{\rm bad}_{SB}|\Psi^{\rm bad}_{SB}\rangle = e^{-D}\sinh D , \qquad P^{\rm good} = \langle\Psi^{\rm good}_{SB}|\Psi^{\rm good}_{SB}\rangle = e^{-D}\cosh D . \tag{80}$$

If T is the elapsed time between the preparation and measurement of the qubit, then

$$D = \int_{-\infty}^{\infty}\frac{d\omega}{2\pi}\,\tilde\Delta(\omega)\,\frac{4\sin^2(\omega T/2)}{\omega^2} , \tag{81}$$

where

$$\Delta(t-s) \equiv \langle\phi(t)\phi(s)\rangle = \int_{-\infty}^{\infty}\frac{d\omega}{2\pi}\,e^{-i\omega(t-s)}\,\tilde\Delta(\omega) \tag{82}$$

(we have assumed the noise is stationary).



In the case of zero-temperature Ohmic noise, with $\tilde\Delta(\omega)$ given by eq. (17), we find that

$$D = -\int_0^T dt\int_0^T ds\,\frac{A}{(t-s-i\tau_c)^2} = A\ln\left(\frac{T^2+\tau_c^2}{\tau_c^2}\right) \approx 2A\ln(T/\tau_c) . \tag{83}$$

Thus the quantity D (an upper bound on the probability $P^{\rm bad}$ of a measurement error) has only a mild
logarithmic sensitivity to the ultraviolet cutoff $\tau_c$, in contrast to the power dependence on the cutoff found
in eq. (43). This improvement occurs because D is found by integrating the bath two-point function $\Delta(t)$,
rather than its modulus $|\Delta(t)|$, which can be justified because the perturbation $H_{SB}$ commutes with the
ideal system Hamiltonian $H_S$. Even this logarithmic dependence on $\tau_c$ may be spurious; it arises because we
have assumed that the ideal qubit preparation (at time t = 0) and qubit measurement (at time t = T) are
instantaneous. The divergence would be softened further if we used a smoother model of preparation and
measurement.
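
The logarithmic form of D can be cross-checked numerically. The sketch below assumes a zero-temperature Ohmic spectrum of the form $\tilde\Delta(\omega) = 2\pi A\,\omega\,e^{-\omega\tau_c}$ for $\omega > 0$ (our assumption for the content of eq. (17), which lies outside this section), for which the frequency integral reduces to $D = 2A\int_0^\infty d\omega\, e^{-\omega\tau_c}(1-\cos\omega T)/\omega$. It compares a simple midpoint-rule evaluation with the closed form $A\ln\bigl((T^2+\tau_c^2)/\tau_c^2\bigr)$. Function names, the integration cutoff, and the step count are illustrative.

```python
from math import cos, exp, isclose, log


def d_closed_form(A: float, T: float, tau_c: float) -> float:
    """Closed form A * ln((T^2 + tau_c^2) / tau_c^2)."""
    return A * log((T**2 + tau_c**2) / tau_c**2)


def d_numeric(A: float, T: float, tau_c: float, steps: int = 200_000) -> float:
    """Midpoint-rule estimate of 2A * int_0^inf exp(-w tau_c) (1 - cos(w T)) / w dw.

    Assumption: Ohmic spectrum 2*pi*A*w*exp(-w*tau_c) for w > 0; the integral
    is truncated at w = 60 / tau_c, where the exponential factor is negligible.
    """
    omega_max = 60.0 / tau_c
    h = omega_max / steps
    total = 0.0
    for i in range(steps):
        w = (i + 0.5) * h
        total += exp(-w * tau_c) * (1.0 - cos(w * T)) / w
    return 2.0 * A * total * h


# Moderate T / tau_c: numerical integral agrees with the closed form.
assert isclose(d_numeric(1e-3, 5.0, 1.0), d_closed_form(1e-3, 5.0, 1.0), rel_tol=1e-4)

# Large T / tau_c: the closed form approaches 2 A ln(T / tau_c).
assert isclose(d_closed_form(1e-3, 1e3, 1.0), 2e-3 * log(1e3), rel_tol=1e-6)
```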

Perhaps the logarithmic dependence of the error probability on the elapsed time T should not be taken too
seriously; it applies only if the noise spectrum is Ohmic down to a frequency of order $1/T$. Let us nevertheless
pursue the implications of this behavior. The crux of the scheme formulated in [53] is a teleported logical
CNOT gate protected against dephasing by an n-qubit repetition code (where n is odd). This CNOT gadget
contains four logical measurements, each of which is decoded by a majority vote. Furthermore, for each
qubit, there are at most 3n time steps (each of duration $t_0$) in between the preparation and measurement
of the qubit, where a CPHASE gate acts on the qubit in each step. Therefore, the probability $\varepsilon_{\rm CNOT}$ of an
encoded error in this CNOT gadget can be bounded as

$$\varepsilon_{\rm CNOT} \le 4\binom{n}{\frac{n+1}{2}}\left(P^{\rm bad}\right)^{(n+1)/2} , \tag{84}$$

where

$$P^{\rm bad} \le D \le 2A\ln\left((3n+2)\,t_0/\tau_c\right) \tag{85}$$

(here we have allowed the noise to act for a time $t_0$ during each CPHASE gate and also during the initial
preparation and final measurement). Hence the logical CNOT gate is well protected if A is small and $t_0/\tau_c$ is
not too large.

While the underlying noise model is Gaussian dephasing noise, under our assumptions the effective noise
model for the CNOT gadgets is independent stochastic noise. Just to be specific, suppose that $A = 10^{-3}$ and
$t_0/\tau_c = 10^{3}$. Then, for code length n = 9, eq. (84) yields $\varepsilon_{\rm CNOT} < 1.85 \times 10^{-6}$. This CNOT error rate is well



below the accuracy threshold for the local stochastic noise model, indicating that these logical CNOT gates 
are adequate for scalable quantum computing. 
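
The numerical example can be reproduced with a few lines of code. The sketch below assumes the bound $\varepsilon_{\rm CNOT} \le 4\binom{n}{(n+1)/2}(P^{\rm bad})^{(n+1)/2}$ with $P^{\rm bad} \le 2A\ln\bigl((3n+2)\,t_0/\tau_c\bigr)$, and the parameter values $A = 10^{-3}$, $t_0/\tau_c = 10^{3}$, n = 9; these exponents are our reading of the damaged source, chosen because they reproduce the quoted figure. Function names are illustrative.

```python
from math import comb, log


def p_bad_bound(A: float, n: int, t0_over_tau_c: float) -> float:
    """Upper bound 2 A ln((3n + 2) t0 / tau_c) on the single-qubit error probability."""
    return 2.0 * A * log((3 * n + 2) * t0_over_tau_c)


def cnot_error_bound(A: float, n: int, t0_over_tau_c: float) -> float:
    """Bound on the encoded CNOT error: four majority-voted logical measurements."""
    half = (n + 1) // 2  # number of flipped qubits needed to fool a majority vote
    return 4 * comb(n, half) * p_bad_bound(A, n, t0_over_tau_c) ** half


bound = cnot_error_bound(A=1e-3, n=9, t0_over_tau_c=1e3)
print(f"{bound:.3e}")
assert 1.84e-6 < bound < 1.85e-6
```

Increasing n or decreasing A in this sketch shows how quickly the bound falls, at the cost of a longer elapsed time per qubit.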



VIII. CONCLUSIONS 



The quantum accuracy threshold theorem indicates that scalable quantum computing is feasible in princi- 
ple. But will fault-tolerant quantum computation really work? One concern is that the noise models assumed 
by theorists are highly idealized, at best crude approximations to the noise in actual devices. In formulating 
these models, one desires on the one hand to capture essential features of realistic noise, but on the other 
hand to allow a succinct and elegant analysis of the computation's reliability. 

Seeking an appropriate balance between these two desiderata, we have proved in this paper a new version 
of the threshold theorem that applies to Gaussian quantum noise, which is physically well motivated and 
analytically tractable. Our result shows that quantum computing is scalable if the noise power spectrum 
obeys a certain condition. Compared to previous results regarding the effectiveness of fault-tolerant methods 
against non-Markovian noise [5, 6, 7], our threshold condition has two advantages: it is expressed in terms
of experimentally observable features of the noise, and it is less sensitive to high-frequency noise. 

As mentioned in Sec. VI, it might be useful to extend our results by relaxing the noise model in several
ways, for example by including weak non-Gaussian corrections to the bath fluctuations, or by modeling more 
realistically the dissipative flow of heat from system to bath. It should also be possible to make further 
improvements in the sensitivity of the threshold condition to high-frequency noise; however, an improved 
condition would be likely to depend on the details of the frequency spectrum of the ideal system dynamics, 
and deriving it would require a more complicated analysis. 

Experimenters tend to worry less about high-frequency noise than about low-frequency noise, particularly 
$1/f$ dephasing noise. We anticipate that low-frequency noise in quantum gates can be suppressed substantially
through clever design of pulse sequences, leaving weak residual noise to be tamed via the fault-tolerant
methods we have studied here. Joining pulse shaping methods with fault-tolerant circuit construction will 
be a fruitful topic for future research. 
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APPENDIX A: THRESHOLD THEOREM FOR LOCAL NOISE 

Here we will briefly sketch the proof of the quantum accuracy threshold theorem for local noise, following 
the argument in [6]. 

We assume that the joint evolution of the quantum computer (the system S) and its environment (the 
bath B) is governed by the Hamiltonian $H = H_S + H_B + H_{SB}$, where the perturbation $H_{SB}$ is responsible 
for the deviation of the system from its ideal evolution. Although this framework can be generalized (as 
discussed in Sec. VII), we also assume that the system qubits are initialized ideally in the pure state $|\Psi_S^0\rangle$ 
before the Hamiltonian evolution begins and measured ideally after it ends. Furthermore, the initial state 
of the bath is the pure state $|\Psi_B^0\rangle$. Just before the ideal measurements are performed on the system qubits, 
the joint state of the system and bath is $|\Psi_{SB}\rangle = U_{SB}|\Psi_{SB}^0\rangle$, where $U_{SB}$ is the time-evolution operator 
determined by $H$, and $|\Psi_{SB}^0\rangle = |\Psi_S^0\rangle \otimes |\Psi_B^0\rangle$. We obtain a fault-path expansion for $|\Psi_{SB}\rangle$ by expanding 
$U_{SB}$ as a power series in $H_{SB}$, and for each term in this expansion we declare a level-0 circuit location to 
be bad if $H_{SB}$ acts nontrivially somewhere within that location. 

For any specified set $\mathcal{I}_r$ of $r$ locations in the circuit, we denote by $|\Psi_{SB}(\mathcal{I}_r)\rangle$ the sum of all the terms in 
the fault-path expansion of $|\Psi_{SB}\rangle$ such that all of these $r$ locations are bad. The noise is local with strength 
$\epsilon$ if 

$$ \| |\Psi_{SB}(\mathcal{I}_r)\rangle \| \le \epsilon^r . \tag{A1} $$

Our objective is to show that scalable quantum computing is possible provided that $\epsilon < \epsilon_0$, where $\epsilon_0$ is (a 
lower bound on) the accuracy threshold. 

Suppose that a universal set of fault-tolerant level-1 gadgets can be constructed such that a 1-gadget 
containing fewer than $s$ faulty level-0 gates simulates the corresponding ideal 0-gate correctly. We can 
estimate the effective noise strength for a level-1 simulation using the following observation: Consider a set 
$\mathcal{I}$ of level-0 locations in a quantum circuit. Then the sum of all fault paths such that at least $s$ of the 
locations in the set $\mathcal{I}$ are faulty can be expressed as 

$$ |\Psi_{SB}(\ge s \text{ faults in } \mathcal{I})\rangle = \sum_{\ell=s}^{|\mathcal{I}|} (-1)^{\ell-s} \binom{\ell-1}{s-1} \sum_{\mathcal{I}_\ell} |\Psi_{SB}(\mathcal{I}_\ell)\rangle , \tag{A2} $$

where $\sum_{\mathcal{I}_\ell}$ denotes the sum over all subsets of $\mathcal{I}$ that contain $\ell$ elements. Eq. (A2) follows from the 
"inclusion-exclusion principle" of combinatorics. For example, in the case $s=1$ it becomes 

$$ |\Psi_{SB}(\ge 1 \text{ fault in } \mathcal{I})\rangle = \sum_{\mathcal{I}_1} |\Psi_{SB}(\mathcal{I}_1)\rangle - \sum_{\mathcal{I}_2} |\Psi_{SB}(\mathcal{I}_2)\rangle + \sum_{\mathcal{I}_3} |\Psi_{SB}(\mathcal{I}_3)\rangle - \sum_{\mathcal{I}_4} |\Psi_{SB}(\mathcal{I}_4)\rangle + \cdots , \tag{A3} $$

whose origin is easy to understand. The first term counts correctly each fault path with exactly one fault in 
$\mathcal{I}$, but it double-counts each fault path with exactly two faults, and this over-counting is corrected by the 
second term. The first term counts three times each fault path with exactly three faults, and the second 
term subtracts these fault paths $\binom{3}{2}$ times; this under-counting is corrected by the third term. And so on. 
The norm of the left-hand side of eq. (A2) is bounded above by the sum of the norms of the terms on the 
right-hand side. Using $\| |\Psi_{SB}(\mathcal{I}_\ell)\rangle \| \le \epsilon^\ell$ and denoting $|\mathcal{I}| = A$ we find 

$$ \| |\Psi_{SB}(\ge s \text{ faults in } A \text{ locations})\rangle \| \le \binom{A}{s} \epsilon^s \sum_{j=s}^{\infty} \frac{[(A-s)\epsilon]^{j-s}}{(j-s)!} \le C \binom{A}{s} \epsilon^s ; \tag{A4} $$

here $C$ is a constant satisfying $C \ge e^{(A-s)\epsilon}$ for values of $\epsilon$ in some specified range of interest, which typically 
can be chosen such that $C$ is close to 1. 
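
The counting behind eq. (A2) and the bound in eq. (A4) can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names and the sample values of $A$, $s$, and $\epsilon$ are illustrative assumptions. It uses the fact that a fault path with exactly $m$ faults in the set appears in $\binom{m}{\ell}$ of the size-$\ell$ subset sums.

```python
from math import comb, exp

def times_counted(m: int, s: int) -> int:
    """Net multiplicity that (A2) assigns to a fault path with exactly m faults in the set."""
    # Such a path appears in comb(m, l) of the size-l subset sums.
    return sum((-1) ** (l - s) * comb(l - 1, s - 1) * comb(m, l)
               for l in range(s, m + 1))

def termwise_norm_sum(A: int, s: int, eps: float) -> float:
    """Sum of the norm bounds eps**l over all terms on the right-hand side of (A2)."""
    return sum(comb(l - 1, s - 1) * comb(A, l) * eps ** l for l in range(s, A + 1))

def a4_bound(A: int, s: int, eps: float) -> float:
    """Right-hand side of (A4) with the choice C = exp((A - s) * eps)."""
    return exp((A - s) * eps) * comb(A, s) * eps ** s

# Paths with at least s faults are counted exactly once, all others not at all.
for s in range(1, 6):
    for m in range(0, 15):
        assert times_counted(m, s) == (1 if m >= s else 0)

# Hypothetical gadget size and noise strength, chosen only for illustration.
A, s, eps = 200, 2, 1e-3
print(termwise_norm_sum(A, s, eps), "<=", a4_bound(A, s, eps))
```

The first loop confirms the inclusion-exclusion weights for small cases; the final line compares the term-by-term norm sum with the closed-form bound for one sample choice of parameters.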

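In a level-1 simulation of a quantum circuit, let us say that a level-1 gadget is "bad" if it contains $s$ or 
more bad level-0 gates, and let $\mathcal{I}_r^{(1)}$ denote a set of $r$ specified level-1 gadgets. We assume, for now, that 
these level-1 gadgets are nonoverlapping; i.e., that no 0-gate is contained in two different 1-gadgets. We 
denote by $|\Psi_{SB}^{(1)}(\mathcal{I}_r^{(1)})\rangle$ the sum of all terms in the perturbation expansion of $|\Psi_{SB}\rangle$ such that all of the $r$ 
1-gadgets in $\mathcal{I}_r^{(1)}$ are bad. By performing an "inclusion-exclusion" sum independently inside each 1-gadget, 
we find that 

$$ |\Psi_{SB}^{(1)}(\mathcal{I}_r^{(1)})\rangle = \sum_{\ell_1=s}^{|\mathcal{I}(1)|} \cdots \sum_{\ell_r=s}^{|\mathcal{I}(r)|} \left[ \prod_{j=1}^{r} (-1)^{\ell_j-s} \binom{\ell_j-1}{s-1} \right] \sum_{\mathcal{I}(1)_{\ell_1} \subseteq \mathcal{I}(1)} \cdots \sum_{\mathcal{I}(r)_{\ell_r} \subseteq \mathcal{I}(r)} |\Psi_{SB}(\mathcal{I}(1)_{\ell_1} \cup \cdots \cup \mathcal{I}(r)_{\ell_r})\rangle , \tag{A5} $$

where $\mathcal{I}(j)$ denotes the set of 0-gates inside the 1-gadget $j$, for $j \in \{1,2,3,\ldots,r\}$, and $\sum_{\mathcal{I}(j)_{\ell_j} \subseteq \mathcal{I}(j)}$ denotes the 
sum over all subsets of $\mathcal{I}(j)$ that contain $\ell_j$ elements. 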



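The local noise condition $\| |\Psi_{SB}(\mathcal{I}_r)\rangle \| \le \epsilon^r$ implies that 

$$ \| |\Psi_{SB}(\mathcal{I}(1)_{\ell_1} \cup \cdots \cup \mathcal{I}(r)_{\ell_r})\rangle \| \le \prod_{j=1}^{r} \epsilon^{\ell_j} . \tag{A6} $$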



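As above, we can bound the norm of the left-hand side of eq. (A5) by the sum of the norms of the terms on 
the right-hand side. Using eq. (A6), this upper bound factorizes into a product of $r$ sums, each of which can 
be bounded as in eq. (A4). We obtain 

$$ \| |\Psi_{SB}^{(1)}(\mathcal{I}_r^{(1)})\rangle \| \le \prod_{j=1}^{r} \epsilon_j^{(1)} , \tag{A7} $$

where 

$$ \epsilon_j^{(1)} = C_j \binom{A_j}{s} \epsilon^s , \tag{A8} $$

and hence 

$$ \| |\Psi_{SB}^{(1)}(\mathcal{I}_r^{(1)})\rangle \| \le \left( \epsilon^{(1)} \right)^r , \tag{A9} $$

where 

$$ \epsilon^{(1)} = \max_j \left( \epsilon_j^{(1)} \right) . \tag{A10} $$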
Here $A_j = |\mathcal{I}(j)|$, $C_j \ge \exp((A_j - s)\epsilon)$, and the maximum is over all 1-gadgets in the circuit. We can regard 
$\epsilon^{(1)}$ as an effective noise strength for the level-1 circuit, which is conveniently expressed in the form 

$$ \epsilon^{(1)} = \epsilon_0 \left( \epsilon/\epsilon_0 \right)^s , \tag{A11} $$

where 

$$ \epsilon_0 = \min \left[ C \binom{A}{s} \right]^{-1/(s-1)} , \tag{A12} $$

and the minimum is over all 1-gadgets in our universal set. 
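
To make eqs. (A8), (A10), (A11), and (A12) concrete, here is a minimal sketch that evaluates them for a made-up gadget set. The gadget sizes, the value of $s$, and the cutoff `eps_max` defining the "range of interest" for $C$ are hypothetical values chosen for illustration; they are not taken from any particular fault-tolerant scheme.

```python
from math import comb, exp

def threshold_estimate(gadget_sizes, s, eps_max):
    """eps_0 from (A12), taking C_j = exp((A_j - s) * eps_max) for each gadget."""
    if s < 2:
        raise ValueError("(A12) requires s >= 2")
    return min((exp((A - s) * eps_max) * comb(A, s)) ** (-1.0 / (s - 1))
               for A in gadget_sizes)

def level1_strength(eps, gadget_sizes, s):
    """eps^(1) from (A8) and (A10), using C_j = exp((A_j - s) * eps)."""
    return max(exp((A - s) * eps) * comb(A, s) * eps ** s for A in gadget_sizes)

# Hypothetical numbers of level-0 locations in each 1-gadget of the universal set.
sizes = [100, 250, 400]
s = 2            # gadgets assumed to tolerate fewer than s = 2 faults
eps_max = 1e-4   # assumed upper end of the range of noise strengths considered

eps0 = threshold_estimate(sizes, s, eps_max)
eps = 0.5 * eps0
eps1 = level1_strength(eps, sizes, s)

print("threshold estimate eps_0 :", eps0)
print("level-0 noise strength   :", eps)
print("level-1 noise strength   :", eps1)
print("eps_0 * (eps/eps_0)**s   :", eps0 * (eps / eps0) ** s)
```

For any $\epsilon$ below both `eps_max` and $\epsilon_0$, the computed level-1 strength is no larger than $\epsilon_0(\epsilon/\epsilon_0)^s$, which is in turn smaller than $\epsilon$. In this sketch the largest gadget sets the threshold estimate.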

Now let us say that a level-$k$ gadget is bad if it contains $s$ or more bad $(k-1)$-gadgets. The bound 
eq. (A9) on the norm of the sum $|\Psi_{SB}^{(1)}(\mathcal{I}_r^{(1)})\rangle$ over all fault paths that are bad at level 1 is of the same 
form as the bound eq. (A1) on the norm of the sum over all fault paths that are bad at level 0, but with 
a "renormalized" value of the effective noise strength. This means that in a recursive simulation, in which 
$k$-gadgets are constructed using the same circuits as 1-gadgets, but with each 0-gate in the 1-gadget replaced 
by a $(k-1)$-gadget, we can use the same combinatoric argument again to estimate the effective noise strength 
at level $k$. That is, suppose that 

$$ \| |\Psi_{SB}^{(k-1)}(\mathcal{I}_r^{(k-1)})\rangle \| \le \left( \epsilon^{(k-1)} \right)^r , \tag{A13} $$

where $\mathcal{I}_r^{(k-1)}$ is any specified set of $r$ $(k-1)$-gadgets in a level-$(k-1)$ simulation, and $|\Psi_{SB}^{(k-1)}(\mathcal{I}_r^{(k-1)})\rangle$ denotes 
the sum of all fault paths such that all $r$ of the $(k-1)$-gadgets in $\mathcal{I}_r^{(k-1)}$ are bad. Then we may infer that 

$$ \| |\Psi_{SB}^{(k)}(\mathcal{I}_r^{(k)})\rangle \| \le \left( \epsilon^{(k)} \right)^r , \tag{A14} $$

where $\mathcal{I}_r^{(k)}$ is any specified set of $r$ $k$-gadgets in a level-$k$ simulation, $|\Psi_{SB}^{(k)}(\mathcal{I}_r^{(k)})\rangle$ denotes the sum of all fault 
paths such that all $r$ of the $k$-gadgets in $\mathcal{I}_r^{(k)}$ are bad, and 

$$ \epsilon^{(k)}/\epsilon_0 = \left( \epsilon^{(k-1)}/\epsilon_0 \right)^s = \left( \epsilon/\epsilon_0 \right)^{s^k} . \tag{A15} $$
The fault-path expansion of a level-$k$ simulation with all together $L$ $k$-gadgets can be expressed as 

$$ |\Psi_{SB}\rangle = |\Psi_{SB}^{\rm good}\rangle + |\Psi_{SB}^{\rm bad}\rangle , \tag{A16} $$

where $|\Psi_{SB}^{\rm good}\rangle$ is the sum of all fault paths such that every $k$-gadget is good, and $|\Psi_{SB}^{\rm bad}\rangle$ is the sum of all 
fault paths such that at least one $k$-gadget is bad. Combining the $s=1$ case of eq. (A4) with eq. (A14), then, 
we see that 

$$ \| |\Psi_{SB}^{\rm bad}\rangle \| \le L \exp\left( (L-1)\epsilon^{(k)} \right) \epsilon^{(k)} , \tag{A17} $$

which is small for $L\epsilon^{(k)} \ll 1$. Furthermore, the arguments in [6] show that if $|\Psi_{SB}\rangle$ is good then the 
probability distribution $p^{(\rm actual)} = \{p_a^{(\rm actual)}\}$ governing the measurement outcome for the logical system 
qubits (where $p_a$ is the probability of the measurement outcome labeled by $a$) matches exactly the probability 
distribution $p^{(\rm ideal)}$ for the measurement outcomes in the ideal computation. Therefore, the deviation of 
$p^{(\rm actual)}$ from $p^{(\rm ideal)}$ in the noisy simulation arises only from the small bad component of $|\Psi_{SB}\rangle$; in fact in 
the $L^1$ norm this deviation can be bounded as 

$$ \delta \equiv \| p^{(\rm actual)} - p^{(\rm ideal)} \|_1 = \sum_a | p_a^{(\rm actual)} - p_a^{(\rm ideal)} | \le 2 \| |\Psi_{SB}^{\rm bad}\rangle \| . \tag{A18} $$

Therefore, for $\epsilon < \epsilon_0$, the noisy computation becomes highly reliable as the level $k$ of the simulation increases; 
thus $\epsilon_0$ is a lower bound on the accuracy threshold for quantum computation. 
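
The interplay of eqs. (A15), (A17), and (A18) can be illustrated by asking how many levels of recursion suffice to reach a target accuracy. The sketch below is only an illustration under assumed inputs: the function names, the cap `max_level`, and the sample values of $\epsilon$, $\epsilon_0$, $s$, $L$, and the target $\delta$ are hypothetical.

```python
from math import exp

def level_k_strength(eps, eps0, s, k):
    """Effective noise strength at level k, from (A15)."""
    return eps0 * (eps / eps0) ** (s ** k)

def deviation_bound(eps_k, L):
    """Upper bound on delta obtained by combining (A17) and (A18)."""
    x = (L - 1) * eps_k
    if x > 700.0:            # exp(x) would overflow a float; the bound is useless anyway
        return float("inf")
    return 2.0 * L * exp(x) * eps_k

def levels_needed(eps, eps0, s, L, delta, max_level=20):
    """Smallest level k (up to max_level) whose bound on delta meets the target."""
    if not 0.0 < eps < eps0:
        raise ValueError("the argument requires 0 < eps < eps0")
    for k in range(max_level + 1):
        if deviation_bound(level_k_strength(eps, eps0, s, k), L) <= delta:
            return k
    return None

# Hypothetical inputs: noise a factor of 10 below threshold, 10**9 logical gadgets.
k = levels_needed(eps=1e-5, eps0=1e-4, s=2, L=10**9, delta=1e-2)
print("levels of recursion needed:", k)
```

Because $\epsilon^{(k)}$ decreases doubly exponentially in $k$ when $\epsilon < \epsilon_0$, the required level grows only slowly with the circuit size $L$.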

In [5], two valuable extensions of this argument were formulated that are useful for pushing the threshold 
estimate $\epsilon_0$ higher. First, the argument can be applied to simulations where successive 1-gadgets overlap, 
i.e., have 0-gates in common. By allowing the gadgets to overlap, we can justify the estimate eq. (A12) 
when using properly designed gadgets based on a quantum error-correcting code that can correct $s-1$ errors 
in a code block. Second, we can refine the definition of badness, so that a 1-gadget with $s$ or more faults is 
declared bad only if the faults occur at a "malignant" set of locations, i.e., only if the 1-gadget processes 
encoded information incorrectly because of the faults. For example, for gadgets that can correct one error 
(the $s=2$ case), our estimate of the level-1 effective noise strength improves to 

$$ \epsilon^{(1)} = B\epsilon^2 + D\epsilon^3 \le \epsilon^2/\epsilon_0 , \tag{A19} $$

where $B$ is the number of malignant pairs of fault locations in the 1-gadget (maximized over all 1-gadgets), 
$D\epsilon^3$ is a correction arising from summing contributions from fault paths with three or more faults in the 
1-gadget, and 

$$ \epsilon_0^{-1} = \frac{B}{2} \left( 1 + \sqrt{1 + 4D/B^2} \right) \tag{A20} $$

is our improved threshold estimate, found by solving the equation $B\epsilon_0^2 + D\epsilon_0^3 = \epsilon_0$. 
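
As a quick consistency check on eqs. (A19) and (A20), the following sketch evaluates the closed-form threshold and confirms that it is a fixed point of the level-1 map. The values of $B$ and $D$ are hypothetical placeholders, not counts from any actual gadget, and the function names are introduced here for illustration only.

```python
from math import sqrt

def improved_threshold(B: float, D: float) -> float:
    """eps_0 from (A20): the positive root of B*eps**2 + D*eps**3 = eps."""
    return 1.0 / (0.5 * B * (1.0 + sqrt(1.0 + 4.0 * D / B ** 2)))

def malignant_level1_strength(eps: float, B: float, D: float) -> float:
    """Level-1 effective noise strength from (A19)."""
    return B * eps ** 2 + D * eps ** 3

# Hypothetical malignant-pair count B and higher-order coefficient D.
B, D = 1.0e3, 1.0e5
eps0 = improved_threshold(B, D)

# At eps = eps0 the map returns eps0; below threshold it is bounded by eps**2 / eps0.
print("eps_0            :", eps0)
print("map at eps_0     :", malignant_level1_strength(eps0, B, D))
eps = 0.3 * eps0
print("map at 0.3 eps_0 :", malignant_level1_strength(eps, B, D), "<=", eps ** 2 / eps0)
```

Setting $D = 0$ recovers the simpler estimate $\epsilon_0 = 1/B$, so the square-root factor in eq. (A20) measures how much the three-or-more-fault correction lowers the threshold estimate.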



